(54) Image processing apparatus



(57) In an apparatus and method for creating a three-dimensional model of an object, images of the
object taken from different, unknown positions are processed to identify the points in the images which
correspond to the same point on the actual object (that is, "matching" points), the matching points are
used to determine the relative positions from which the images were taken, and the matching points and
calculated positions are used to calculate points in a three-dimensional space representing points on the
object. A number of different techniques are used to identify the matching points, and a number of
solutions are calculated and tested for the relative positions, the solution which is consistent with the
largest number of matching points being selected. In one matching technique, edges in an image are
identified by first identifying corner points in the image and then identifying edges between the corner
points on the basis of edge orientation values of pixels, the edges are processed in strength order to
remove cross-overs, the images are sub-divided into regions by connecting points at the ends of the edges
on the basis of the edge strengths, and matching points within corresponding regions in two or more
images are identified.
Description

[0001] The present invention relates to an image processing apparatus and method.

[0002] Creating three-dimensional computer models of a real-life object has traditionally been time consuming and
has required skilled personnel and/or expensive equipment. One reason for this is that it is necessary to determine
the depth of points on the object as well as the relative positions of the points in two dimensions.
[0003] Known techniques for creating three-dimensional computer models from real objects fall into one of two
categories, namely active techniques, in which depth information is obtained by actively sensing the surface of the
object, and passive techniques, in which depth information is obtained from images of the object.

[0004] Examples of active techniques include scanning the object with a pulsed laser beam and measuring the
detection time of pulses relative to their transmission time to determine depth information (as in a laser "rangefinder"),
and touching the object at a number of points on its surface with a position-sensitive probe.

[0005] In conventional passive techniques, at least two images of the object taken from different camera positions
are needed. To construct a three-dimensional computer model from the images, it is necessary to know firstly the
location in each image of points which represent the same actual point on the object, and secondly the relative positions
from which the images were taken. These are particularly onerous requirements. As a result, in known passive systems,
distinguishing marks/calibrations are added to the object or its surroundings to enable matching points to be easily
identified in the images, and/or the images are taken from known camera positions.

[0006] For example, WO-A-90/10194 discloses a system for measuring strain distribution in an object using the
three-dimensional coordinates of points on the object surface calculated from two images of the object. To facilitate
the matching of points in the images, a uniform square grid pattern is applied to the object before the images are taken
by electrochemical etching or silk screening. Corresponding points of intersection of the grid lines can then be easily
identified in the images. Further, the object is placed on a rotary table which is rotated by a known angle between
images, thereby defining the relative camera positions.

[0007] US 4803645 discloses a system for determining three-dimensional coordinates of points on an object in which
a grid or similar periodic pattern is projected onto the object using a light projector, and images of the object are taken
using three imaging systems at fixed, predefined positions.

[0008] WO-A-88/02518 discloses a system for producing a depth map of an object from a plurality of images taken
from imaging devices set in a predefined, known configuration. Similarly, US 5307136 discloses an automobile distance
detection system which determines the range of an automobile using images taken from a plurality of cameras mounted
in a predefined, known configuration in the user's automobile.

[0009] [...], WO-A-92/06444 and US 4695156 all disclose systems for determining three-dimensional
[...] on an object surface from stereo images of the object taken at known camera positions.
[0010] In many cases, however, it is inconvenient, expensive, and/or infeasible to provide a reference grid or
markings/calibrations on the object or its surroundings, or to take images from known relative positions. Reliable and
accurate techniques are therefore required to match points in the images and/or to calculate the relative camera positions.

[0011] Even if reference markings are used, it would be desirable to identify other matching points in the images to
give more points with which to calculate the camera positions. This would enable the camera positions to be calculated
more accurately and/or would allow fewer reference markings to be used on the object or its surroundings.
[0012] Further, even if matching points are identified using reference features, an accurate and reliable technique
for determining the positions at which the images were taken is necessary if the images were not taken from known
relative positions.

[0013] The present invention aims to address one or more of the above problems, and aims to provide an image 
processing apparatus and method for determining matching features in images of an object and/or an apparatus and 
method for calculating the positions at which the images were taken. 

[0014] The present invention provides an image processing apparatus or method in which image data for a plurality 
of images of an object is processed without using prior information on the relationship between the positions from which 
the images were taken to identify corresponding object features in the images. Matching features are identified using 
a first technique, the relationship between the images is determined and its accuracy tested, and if the accuracy is 
not sufficient, user-identified features are used to identify matches with a second technique.

[0015] The present invention provides an image processing apparatus or method in which image data for a plurality
of images of an object is processed without using prior information on the relationship between the positions from which
the images were taken to identify corresponding object features in the images. The following steps are iteratively
performed until a desired accuracy is achieved: (i) user-identified features are used to identify further matching features
and (ii) the accuracy of the further identified features is determined.

[0016] Preferably, the accuracy of the matches is determined by calculating the relationship between the imaging
positions. Signals defining this relationship are then also produced.

[0017] According to the present invention, there is provided an image processing apparatus or method in which 
image data for a plurality of images of an object is processed without using prior information on the relationship between
the positions from which the images were taken to identify corresponding object features in the images. Matching 
features are identified using a first technique, the relationship between the images is determined, and further matches 
are identified using a second technique together with the determined relationship. 
[0018] Preferably, the first technique includes a user identifying features, and the second technique includes the
image processing apparatus identifying features. 

[0019] The present invention provides an image processing apparatus or method in which image data for at least 
three images of an object is processed without using prior information on the relationship between the positions from 
which the images were recorded, to determine the relationship. Matching features in first and second images are 
identified and used to determine the positional relationship between these images. The positional relationship is used
to identify at least one additional match in the first and second images, at least one of the additional matches is then 
matched in a third image and the positional relationship of the third image is determined. 

[0020] The present invention also provides an image processing apparatus or method in which this process is adapted
if corresponding object features in a pair of images are already known, or if the positional relationship between a
pair of images is already known.

[0021] The present invention provides an image processing apparatus or method in which image data for a plurality 
of images of an object is processed without using prior information on the relationship between the positions from which 
the images were taken to identify corresponding object features in the images. Each image is notionally split into regions 
on the basis of matches defined in input signals, and the mapping of regions between images is determined and used 

to identify further matches.

[0022] Embodiments of the invention will now be described by way of example only with reference to the accompa- 
nying drawings, in which: 

[0023] Figure 1 schematically shows the components of an image processing apparatus in an embodiment of the 
invention. 

[0024] Figure 2 illustrates the collection of image data by imaging an object from different positions around the object.
[0025] Figure 3 shows, at a top level, the processing operations performed by the image processing apparatus of 
Figure 1 in an embodiment of the invention. 

[0026] Figure 4 shows the steps performed during initial data input at step S2 in Figure 3. 

[0027] Figure 5 illustrates the sequencing of images by a user at step S22 in Figure 4. 
[0028] Figure 6 shows the relationship between the operations in Figure 3 of initial feature matching at step S4,
calculating camera transformations at step S6 and constrained feature matching at step S8.

[0029] Figure 7 shows in greater detail the relationship between the operations shown in Figure 6. 

[0030] Figure 8 shows the operations performed during automatic initial feature matching across the first pair of 

images in a triple of images at step S52 in Figure 7. 
[0031] Figure 9 shows the operations performed during automatic initial feature matching across the second pair of

images in a triple of images at step S54 in Figure 7. 

[0032] Figure 10a and Figure 10b schematically illustrate a "perspective" image and an "affine" image, respectively. 
[0033] Figure 11 shows, at a top level, the operations performed during affine initial feature matching for the first (or
second) pair of images in a triple of images at step S62 or step S64 in Figure 7.
[0034] Figure 12 shows the operations performed in finding the edges in each image of a pair of images at step S100
in Figure 11.

[0035] Figure 13 illustrates the pixels which are considered when calculating edge strengths at step S106 or step
S108 in Figure 12.

[0036] Figure 14 shows the operations performed when calculating edge strengths at step S106 and step S108 in
Figure 12.

[0037] Figure 15 shows the operations performed when removing edges which cross over other edges at step S112 
in Figure 12. 

[0038] Figure 16a, Figure 16b and Figure 16c show examples of two edges, Figures 16a and 16b showing examples
in which the edges do not cross, and Figure 16c showing an example in which the edges do cross.
[0039] Figure 17 shows the operations performed when triangulating points at step S102 in Figure 11.

[0040] Figure 18 shows the operations performed when calculating further corresponding points in a pair of images
at step S104 in Figure 11.

[0041] Figure 19 illustrates the use of a grid of squares at steps S162, S174 and S180 in Figure 18.
[0042] Figure 20 shows, at a top level, the operations performed when calculating the camera transformations for a
triple of images at steps S56 and S66 in Figure 7.

[0043] Figure 21 shows, at a top level, the operations performed when carrying out processing routine 1 at step S202 
in Figure 20. 

[0044] Figure 22 shows the operations performed when setting up the parameters at step S206 in Figure 21.






EP 0 898 245 A1 

[...] the operations performed when determining the number of iterations to be carried out at step [...]

[...] the operations performed when carrying out a perspective calculation for an image pair at [...]

[...] step S242 in Figure 24 [...] the operations performed when carrying out an affine calculation for an image pair at step [...]

[...] the operations performed when calculating the camera transformations for all three images [...]

[...] Figure 29 illustrates the scale, and the rotation angles P1 and P2, for the three images in a triple [...]

[...] the operations performed [...] and/or [...], S354 [...]

[...] the operations performed when calculating the best scale at step S382 in Figure 30.

[...] Figure 34 shows the operations performed to test the calculated scale against triple points at step S404 [...]

[...] Figure 36 illustrates the projection of [...] for points in the outside images of a triple of images at step S426 [...]

[...] reading existing parameters and setting up [...]

[...] the operations performed when calculating the camera transformations for all three images [...]

[...] the operations carried out when performing constrained feature matching [...]

[...] shows, at a top level, the operations performed when generating 3D data at step S10 in Figure 3.

[...] the operations performed when calculating the 3D projection of points within each [...]

[...] which forms part of a triple with a subsequent image at step S520 in Figure 41.

[...] the results when step S520 in Figure 41 has been performed for a number of points [...]

[...] the operations performed in [...] discarding inaccurate 3D points and calculating
the error for each pair of camera positions at step S522 in Figure 41.

[0067] Figures 45a and 45b illustrate the shift calculated at step S556 in Figure 44 between 3D points for a given
pair of camera positions and corresponding points for the next pair of camera positions.

[0068] Figure 46 illustrates corrected 3D points for the next pair of camera positions which result after step S566
[...] has been performed, and the corresponding points for the current pair of camera positions.

[0069] Figure 47 illustrates a number of points in 3D space and their associated error ellipsoids.




[0070] Figure 48 shows the steps performed when checking whether combined 3D points correspond to unique
image points and merging ones that do not at step S528 in Figure 41.

[0071] Figure 49 shows the operations performed when generating surfaces at step S12 in Figure 3.
[0072] Figure 50 shows the steps performed when displaying surface data at step S14 in Figure 3.

[0073] In the embodiment which will now be described, the object data representing the three-dimensional model of
the object recreated from the two-dimensional photographs is processed to display an image of the object to a user
from a desired viewing direction. The object data may, however, be processed in many other ways for different
applications. For example, the three-dimensional model may be used to control manufacturing equipment to
manufacture a model of the object. Alternatively, the object data may be processed so as to recognise the object, for example
[...] a database. The data may also be processed to make measurements on the
object. This may be particularly advantageous where measurements cannot be made directly on the object itself, for
example if it would be hazardous to make such measurements (if the object was radioactive, for example). The
three-dimensional model may also be compared with three-dimensional models of the object previously generated to
determine changes therebetween, representing actual physical changes to the object itself. The three-dimensional model
may also be used to control movement of a robot to prevent the robot from colliding with the object. Of course, the
object data may be transmitted to a remote processing device before any of the above processing is performed. In
particular, the object data may be provided in virtual reality mark-up language (VRML) format for transmission over the
Internet.

[0074] Figure 1 is a block diagram showing the general arrangement of an image processing apparatus in an em- 
bodiment. In the apparatus, there is provided a computer 2, which comprises a central processing unit (CPU) 4 con- 
nected to a memory 6 operable to store a program defining the operations to be performed by the CPU 4, and to store 
object and image data processed by CPU 4. 

[0075] Coupled to the memory 6 is a disk drive 8 which is operable to accept removable data storage media, such
as a floppy disk 10, and to transfer data stored thereon to the memory 6. Operating instructions for the central processing
unit 4 may be input to the memory 6 from a removable data storage medium using the disk drive 8.
[0076] Image data to be processed by the CPU 4 may also be input to the computer 2 from a removable data storage
medium using the disk drive 8. Alternatively, or in addition, image data to be processed may be input to memory 6
directly from a camera 12 having a digital image data output, such as the Canon Powershot 600. The image data may
be stored in camera 12 prior to input to memory 6, or may be transferred to memory 6 in real time as the data is gathered
by camera 12. Image data may also be input from a conventional film camera instead of digital camera 12. In this case,
a scanner (not shown) is used to scan photographs taken by the camera and to produce digital image data therefrom
for input to memory 6. In addition, image data may be downloaded into memory 6 via a connection (not shown) from
a local database, such as a Kodak Photo CD apparatus in which image data is stored on optical disks, or from a remote
database which stores the image data.

[0077] Coupled to an input port of CPU 4, there is an input device 14, which may comprise, for example, a keyboard
and/or a position sensitive input device such as a mouse, a trackerball, etc.

[0078] Also coupled to the CPU 4 is a frame buffer 16 which comprises a memory unit arranged to store image data
relating to at least one image generated by the central processing unit 4, for example by providing one (or several)
memory location(s) for a pixel of the image. The value stored in the frame buffer for each pixel defines the colour or
intensity of that pixel in the image.

[0079] Coupled to the frame buffer 16 is a display unit 18 for displaying the image stored in the frame buffer 16 in a 
conventional manner. Also coupled to the frame buffer 16 is a video tape recorder (VTR) 20 or other image recording 
device, such as a paper printer or 35mm film recorder.

[0080] A mass storage device 22, such as a hard disk drive, having a high data storage capacity, is coupled to the
memory 6 (typically via the CPU 4), and also to the frame buffer 16. The mass storage device 22 can receive data
processed by the central processing unit 4 from the memory 6 or data from the frame buffer 16 which is to be displayed
on display unit 18.

[0081] The CPU 4, memory 6, frame buffer 16, display unit 18 and the mass storage device 22 may form part of a
commercially available complete system, for example a workstation such as the SparcStation available from Sun
Microsystems.

[0082] Operating instructions for causing the computer 2 to perform as an embodiment of the invention can be
supplied commercially in the form of programs stored on floppy disk 10 or another data storage medium, or can be
transmitted as a signal to computer 2, for example over a datalink (not shown), so that the receiving computer 2 becomes
reconfigured into an apparatus embodying the invention.

[0083] Figure 2 illustrates the collection of image data for processing by the CPU 4. 

[0084] An object 24 is imaged using camera 12 from a plurality of different locations. By way of example, Figure 2 
illustrates the case where object 24 is imaged from five different, random locations labelled L1 to L5, with the arrows 
in Figure 2 illustrating the movement of the camera 12 between the different locations.

[0085] Image data recorded at positions L1 to L5 is stored in camera 12 and subsequently downloaded into memory
6 of computer 2 for processing by the CPU 4 in a manner which will now be described. In this embodiment, CPU 4 
does not receive information defining the locations at which the images were taken, either in absolute terms or relative 
to each other. 

[0086] Figure 3 shows the top-level processing routines performed by CPU 4 to process the image data from camera
12.

[0087] At step S2, a routine for initial data input is performed, which will be described below with reference to Figures
4 and 5. The aim of this routine is to store the image data received from camera 12 in a manner which facilitates 
subsequent processing, and to store information concerning parameters of the camera 12. 
[0088] At step S4, initial feature matching is performed to match features within the different images taken of the
object 24 (that is, to identify points in the images which correspond to the same physical point on object 24). This 
process will be described below with reference to Figures 6 to 19. 

[0089] At step S6, the transformations between the different camera positions from which the images were taken
(L1 to L5 in Figure 2), and hence the positions themselves in relative form, are calculated using the points matched in
the images, as will be described below with reference to Figures 20-38.
[0090] At step S8, using the calculated camera transformations from step S6, further features are matched in the
images (the calculated camera transformations being used to calculate, that is "constrain", the position in an image in
which to search for a point matching a given point in another image). This process will be described below with reference
to [...].

[0091] At step S10, points in a three-dimensional modelling space representing actual points on the surface of object
24 are generated, as will be described below with reference to Figures 41 to 48. 

[0092] In step S12, the points in three-dimensional space produced in step S10 are connected to generate
three-dimensional surfaces, representing a three-dimensional model of object 24. This process will be described with
reference to Figure 49.

[0093] In step S14, the 3D model produced in step S12 is processed to display an image of the object 24 from a 
desired viewing direction on display unit 18. This process will be described with reference to Figure 50.
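
By way of illustration only, the top-level ordering of steps S2 to S14 in Figure 3 can be sketched as a list of routines applied one after another. This is a minimal sketch and not the program of the embodiment: the function name run_pipeline and the placeholder routines are hypothetical, and in the embodiment steps S4, S6 and S8 are actually interleaved for each triple of images as described with reference to Figure 6.

```python
def run_pipeline(stages, image_data):
    """Apply each (step label, routine) pair in order, feeding each result forward.

    `stages` is a list such as [("S2", input_routine), ("S4", matching_routine), ...].
    The routines themselves are placeholders supplied by the caller.
    """
    state = image_data
    performed = []
    for label, routine in stages:
        state = routine(state)      # the output of one step is the input of the next
        performed.append(label)     # record the order in which steps were run
    return state, performed


# Hypothetical stand-ins that only tag the data with the name of each step.
STEP_NAMES = [
    ("S2", "initial data input"),
    ("S4", "initial feature matching"),
    ("S6", "calculate camera transformations"),
    ("S8", "constrained feature matching"),
    ("S10", "generate 3D points"),
    ("S12", "generate surfaces"),
    ("S14", "display surface data"),
]
stages = [(label, lambda data, name=name: data + [name]) for label, name in STEP_NAMES]

result, performed = run_pipeline(stages, [])
print(performed)
```

Keeping the step labels alongside the routines makes it easy to check that the order of Figure 3 is respected when real routines are substituted.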
[0094] Figure 4 shows the steps performed in the initial data input routine at step S2 in Figure 3. Referring to Figure
4, at step S16, the CPU 4 waits until image data has been received within memory 6. As noted previously, this image
data may be received from digital camera 12, via floppy disk 10, by digitisation of a photograph using a scanner (not
shown), or by downloading image data from a database, for example via a datalink (not shown), etc.
[0095] After the data for all images has been received, CPU 4 re-stores the data for each image as a separate
project file in memory 6 at step S18. At step S20, CPU 4 reads the stored data from memory 6 and displays the
images to the user on display unit 18.

[0096] Figure 5 illustrates the display of the images to the user. CPU 4 initially displays the images in the order in
which the image data was received. Referring again to Figure 2, images were taken from locations L1, L2, L3, L4 then
L5. Accordingly, the image data of the images taken at these locations is stored in the same sequence within camera
12 and is received by computer 2 in the same order when it is downloaded from camera 12. Therefore, as shown in
Figure 5, CPU 4 initially displays the images on display 18 in the same order, namely L1, L2, L3, L4, L5.
[0097] At the same time as displaying the images, CPU 4 prompts the user, for example by displaying a message
(not shown) on display 18, to rearrange the images into an order which represents the positional sequence in which
the images were taken around object 24, rather than the temporal sequence in which the images are initially displayed.
The temporal sequence and the positional sequence may be the same. However, in the example illustrated in Figure
2, location L3 is between locations L1 and L2. The positional sequence of images around the object 24 is therefore
L1, L3, L2, L4 and L5. Accordingly, at step S22, the user rearranges the images on display 18, for example by
highlighting the image taken at location L2 and dragging it to a position between the images for positions L3 and L4 (as
indicated by the arrow in Figure 5), to give the correct positional sequence for the images.

[0098] Following this, at step S24, CPU 4 calculates the distance between the centres of the images on the display
18 to determine the nearest neighbour(s) for each image. Thus, for example, referring to Figure 5, for the image taken
at position L1, CPU 4 calculates the distance between its centre and the centre of each other image, and determines
that the nearest image is the one taken at position L3. For the image taken at position L3, the CPU 4 calculates the
distance between its centre and each of the images taken at positions L2, L4 and L5 (the CPU already having
determined that the image taken at position L1 is a nearest neighbour on one side of the image taken at position L3). In this
way, CPU 4 determines that the image taken at position L2 is the nearest neighbour of the image taken at position L3
on its other side. The CPU performs the same routine for the images taken at positions L2, L4 and L5.
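
The nearest-neighbour determination of step S24 can be sketched as follows. This is a minimal illustration under stated assumptions, not the routine of the embodiment: each displayed image is assumed to be represented by the (x, y) screen coordinates of its centre, the function name positional_chain and the sample coordinates are hypothetical, and the sketch simply walks from the first image to the nearest image not yet visited, which reproduces the example given for positions L1 and L3.

```python
import math


def positional_chain(centres, start):
    """Order images by repeatedly moving to the nearest centre not yet visited.

    `centres` maps an image name to the (x, y) position of its centre on the display.
    """
    order = [start]
    remaining = set(centres) - {start}
    while remaining:
        cx, cy = centres[order[-1]]
        # Distance between the centre of the current image and every unvisited image.
        nearest = min(
            remaining,
            key=lambda name: math.hypot(centres[name][0] - cx, centres[name][1] - cy),
        )
        order.append(nearest)
        remaining.remove(nearest)
    return order


# Hypothetical display layout after the user has dragged L2 between L3 and L4.
centres = {
    "L1": (50, 100),
    "L3": (150, 100),
    "L2": (250, 100),
    "L4": (350, 100),
    "L5": (450, 100),
}
print(positional_chain(centres, "L1"))  # ['L1', 'L3', 'L2', 'L4', 'L5']
```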
[0099] At step S26, CPU 4 stores links in memory 6 to identify the positional sequence of the images. For example,
CPU 4 creates, and stores in memory 6, the links as separate entities. The data for each link identifies the image at
each end of the link. Thus, referring to the example shown in Figures 2 and 5, CPU 4 creates four links, one having
the images taken at positions L1 and L3 at its ends, one having the images taken at positions L3 and L2 at its ends,
one having images taken at positions L2 and L4 at its ends, and one having images taken at positions L4 and L5 at
its ends.

[0100] At step S26, CPU 4 also stores in the project file for each image (created at step S18) a pointer to each link 
entity connected to the image. For example, the project file for the image taken at position L3 will have pointers to the 
first and second links. 
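
The link entities and project-file pointers stored at step S26 might be represented as in the following sketch. The class names Link and ProjectFile and the function build_links are hypothetical names chosen for illustration; the embodiment does not specify a data layout beyond each link identifying the image at each of its ends and each project file pointing to the links connected to its image.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """A separate entity identifying the image at each end of the link."""
    first: str
    second: str


@dataclass
class ProjectFile:
    """Per-image record holding pointers to every link connected to the image."""
    image: str
    links: list = field(default_factory=list)


def build_links(positional_sequence):
    """Create one link per adjacent pair and register it with both project files."""
    files = {name: ProjectFile(name) for name in positional_sequence}
    links = []
    for a, b in zip(positional_sequence, positional_sequence[1:]):
        link = Link(a, b)
        links.append(link)
        files[a].links.append(link)
        files[b].links.append(link)
    return links, files


links, files = build_links(["L1", "L3", "L2", "L4", "L5"])
print(len(links))                                          # 4
print([(l.first, l.second) for l in files["L3"].links])    # [('L1', 'L3'), ('L3', 'L2')]
```

Five images give four links, and the project file for the image taken at position L3 points to the first and second links, in line with the example above.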

[0101] At step S28, CPU 4 requests the user to input information about the camera with which the image data was 
recorded. CPU 4 does this by displaying a message requesting the user to input the focal length of the camera lens 
and the size of the imaging charge coupled device (CCD) or film within the camera. CPU 4 also displays on display 
18 a list of standard cameras, for which this information is pre-stored in memory 6, and from which the user can select 
the camera used instead of inputting the information directly. At step S30, the user inputs the requested camera data 
(or selects one of the listed cameras), and at step S32, CPU 4 stores the input camera data in memory 6 for future use.
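
The choice at steps S28 to S32 between selecting a listed camera and entering the data directly can be sketched as below. This is an illustrative sketch only: the names CameraData, STANDARD_CAMERAS and get_camera_data are hypothetical, and the entry in the pre-stored list uses placeholder values that do not describe any real camera.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CameraData:
    focal_length_mm: float    # focal length of the camera lens
    sensor_width_mm: float    # width of the imaging CCD or film
    sensor_height_mm: float   # height of the imaging CCD or film


# Hypothetical pre-stored list; the name and numbers are placeholders only.
STANDARD_CAMERAS = {
    "example camera": CameraData(focal_length_mm=7.0, sensor_width_mm=4.8, sensor_height_mm=3.6),
}


def get_camera_data(selected: Optional[str] = None,
                    entered: Optional[Tuple[float, float, float]] = None) -> CameraData:
    """Return pre-stored data for a selected camera, or wrap directly entered values."""
    if selected is not None:
        return STANDARD_CAMERAS[selected]
    if entered is None:
        raise ValueError("either select a listed camera or enter the values directly")
    return CameraData(*entered)


stored = get_camera_data(selected="example camera")
manual = get_camera_data(entered=(35.0, 36.0, 24.0))
print(stored.focal_length_mm, manual.sensor_width_mm)
```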
[0102] The processing of the image data stored in memory 6 by CPU 4 will now be described with reference to 
Figures 6 to 50. 
[0103] Figure 6 shows, at a top level, the relationship between the routines of initial feature matching, calculating
camera transformations and constrained feature matching performed by CPU 4 at steps S4, S6, S8 in Figure 3. For 
the purpose of these routines, CPU 4 considers images in groups of three in the order in which they occur in the 
positional sequence created at step S22 (Figure 4), each group being referred to as a "triple" of images. Thus, in the 
case where data for five images has been stored in memory 6 (as in the example of Figures 2 and 5), CPU 4 considers
three triples of images (images 1-2-3, images 2-3-4, and images 3-4-5 in the positional sequence). Within each triple
of images, there are two "pairs" of images, namely the first and second images within the triple and the second and 
third images within the triple. 
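
The grouping of images into triples and pairs can be sketched as a sliding window over the positional sequence. The function names triples and pairs are hypothetical; the sketch only illustrates the grouping described above.

```python
def triples(sequence):
    """Return every run of three consecutive images in the positional sequence."""
    return [tuple(sequence[i:i + 3]) for i in range(len(sequence) - 2)]


def pairs(triple):
    """Return the two pairs within a triple: (first, second) and (second, third)."""
    first, second, third = triple
    return [(first, second), (second, third)]


print(triples([1, 2, 3, 4, 5]))   # [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
print(pairs((1, 2, 3)))           # [(1, 2), (2, 3)]
```

For five images the window yields three triples, matching the example of Figures 2 and 5.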

[0104] Referring to Figure 6, at step S40, the next triple of images is considered for processing (this being the first
triple, that is images 1-2-3 in the positional sequence, the first time step S40 is performed). At step S42, initial feature
matching is performed for the three images under consideration to match points across pairs of images in the triple or
across all three images, and at step S44 the camera transformations between the positions at which the three images
were taken are calculated using the points matched in step S42. The calculated camera transformations define the
translation and rotation of the camera between images in the positional sequence, as will be described in greater detail
below.

[0105] At step S46, CPU 4 determines whether the camera transformations calculated at step S44 are sufficiently
accurate. If it is determined that the transformations are sufficiently accurate, then, at step S48, further features are
matched in the three images using the calculated camera transformations. The feature matching performed by CPU
4 at step S48 is termed "constrained" feature matching since the camera transformations calculated at step S44 are
used to "constrain" the area within an image of the triple which is searched to identify a point which may match a given
point in another image of the triple. If it is determined at step S46 that the calculated camera transformations are not
sufficiently accurate, then steps S42 to S46 are repeated until sufficiently accurate camera transformations are
obtained. However, as will be described below, when CPU 4 re-performs initial feature matching for the three images at
step S42 for the first time after it has been determined at step S46 that the calculated camera transformations are not
sufficiently accurate, it performs it using a second technique, which is different to the first technique used when step
S42 is performed for the very first time. Further, in any subsequent re-performance of step S42, CPU 4 performs initial
feature matching using the second technique, but with a different number of matched points in the images as input
(the number increasing each time step S42 is repeated).

[0106] At step S50, CPU 4 determines whether there is another image which has not yet been considered in the 
positional sequence of images, and, if there is, steps S40 to S50 are repeated to consider the next triple of images.
These steps are repeated until all images have been processed in the way described above. 
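
The control flow of steps S42 to S48 for one triple can be sketched as follows. This is a sketch under stated assumptions rather than the routine of the embodiment: the matching, solving and accuracy-testing routines are passed in as placeholder callables, and the starting number of user-identified points, the increment `step` and the limit `max_rounds` are hypothetical parameters added for illustration.

```python
def process_triple(triple, auto_match, user_match, solve, is_accurate, constrained_match,
                   first_user_points=10, step=5, max_rounds=5):
    """Match features, solve for camera transformations, and retry until accurate."""
    matches = auto_match(triple)                  # step S42, first (automatic) technique
    transforms = solve(triple, matches)           # step S44
    n_points = first_user_points
    rounds = 0
    while not is_accurate(transforms):            # step S46
        if rounds == max_rounds:                  # hypothetical safeguard against endless retries
            raise RuntimeError("no sufficiently accurate camera transformations found")
        matches = user_match(triple, n_points)    # step S42 repeated, second technique
        transforms = solve(triple, matches)       # step S44 repeated
        n_points += step                          # more matched points on each repetition
        rounds += 1
    further = constrained_match(triple, transforms, matches)   # step S48
    return transforms, further


def process_sequence(sequence, **routines):
    """Steps S40 and S50: apply process_triple to each consecutive triple in turn."""
    return [process_triple(tuple(sequence[i:i + 3]), **routines)
            for i in range(len(sequence) - 2)]
```

The first technique is tried exactly once per triple; every later pass through step S42 uses the second technique with a larger number of input points.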

[0107] Figure 7 shows in greater detail the relationship between the routines of initial feature matching, calculating 
camera transformations and constrained feature matching. 

[0108] Referring to Figure 7, at step S52, CPU 4 performs initial feature matching using a first technique for the first
pair of images in a triple of images, as will be described below. This first initial feature matching technique is automatic,
in the sense that no input from the user is required. At step S54, CPU 4 performs initial feature matching using the
first, automatic technique for the second pair of images in the triple. At step S56, CPU 4 calculates the camera
transformations between the images in the triple. At step S58, CPU 4 determines whether the camera transformations
calculated at step S56 are sufficiently accurate. If they are, constrained feature matching is performed at step S74 to
match further points in the images of the triple.

[0109] On the other hand, if it is determined at step S58 that the calculated camera transformations are not sufficiently
accurate, then CPU 4 performs initial feature matching for the triple of images using a different technique at steps S60
to S68. In this embodiment, an "affine" technique (which assumes that the object 24 in the images does not exhibit
significant perspective properties over small regions of the image) is used, as will be described below.

[0110] At step S60, the user is asked to identify matching points (that is, points which correspond to the same physical
point on object 24) in the first pair of images of the triple and the second pair of images in the triple. This is done by
displaying to the user on display unit 18 the three images in the triple. The user can then move a displayed cursor
using input means 14 to identify a point in the first image and a corresponding, matched point (representing the same
physical point on object 24) in the second image. This process is repeated until ten pairs of points have been matched
in the first and second images. The user then repeats the process to identify ten pairs of matched points in the second
and third images. It may be difficult for the user to precisely locate the displayed cursor at a desired point (which may
occupy only one pixel) when selecting points.

[0111] Accordingly, if any point identified by the user is within two pixels of a point previously identified in that image
by CPU 4 in step S52 or S54 or, if performed previously, in step S62, S64 or S74, then CPU 4 determines that the
user intended to identify a point which it had automatically identified previously and consequently stores the
co-ordinates of this point rather than the point actually identified by the user on display 18.
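
The snapping rule of paragraph [0111] can be sketched as follows. This is a minimal illustration only: the function name `snap_to_known_point`, the use of Euclidean distance, and the inclusive "within two pixels" test are assumptions, since the text does not specify how the two-pixel distance is measured.

```python
import math

def snap_to_known_point(user_pt, known_pts, radius=2.0):
    """Return the nearest CPU-identified point within `radius` pixels of the
    user's click, or the user's own point if none is close enough.

    Assumption: distance is Euclidean and the radius test is inclusive.
    """
    best, best_d = None, radius
    for pt in known_pts:
        d = math.hypot(pt[0] - user_pt[0], pt[1] - user_pt[1])
        if d <= best_d:
            best, best_d = pt, d
    return best if best is not None else user_pt
```

For example, a click at (10, 10) with a previously identified point at (11, 11) would be stored as (11, 11), whereas a click with no identified point within two pixels is stored unchanged.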

[0112] At step S62, CPU 4 matches points in the first pair of images in the triple using the affine matching technique,
and at step S64, it matches points in the second pair of images in the triple using this technique. As will be described
below, in affine feature matching, CPU 4 uses the points matched by the user at step S60 to determine the relationship
between the images in each pair of images, that is the mathematical transformation necessary to transform points from
one image to the other, and uses this to identify further matching points in the images.

[0113] At step S66, CPU 4 uses all of the points which have now been matched to determine again the camera
transformations between the positions at which the three images in the triple were taken, and at step S68 determines
whether the calculated transformations are sufficiently accurate. If it is determined that the transformations are
sufficiently accurate, then CPU 4 performs constrained feature matching for the three images at step S74. On the other
hand, if it is determined that the transformations are not sufficiently accurate, CPU 4 requests the user at step S70 to
match more points across each pair of images in the triple (that is, to identify in each image of a pair the image points
which correspond to the same physical point on object 24). In this embodiment, the user is asked to identify ten pairs
of further matching points in the first pair of images in the triple of images and ten pairs of further matching points in
the second pair of images in the triple. At step S72, the user identifies matching points in the same way as previously
described for step S60. Again, if a user-identified point lies within two pixels of a point previously identified by CPU 4
(either in steps S52 or S54, or in steps S62 or S64, or in step S74) then it is determined that the user intended to
identify that point, and the co-ordinates of the CPU-identified point are stored rather than the user-identified point.

[0114] Steps S62 to S72 are repeated until it is determined at step S68 that sufficiently accurate camera
transformations between the images in the triple have been calculated. That is, the second feature matching technique (in this
embodiment, an "affine" technique) is repeated using a different number of user-identified matching points as input
each time, until sufficient matches are made to allow sufficiently accurate camera transformations to be calculated.
Constrained feature matching for the three images in the triple is then performed at step S74.
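
The control flow of steps S52 to S74 for one triple can be summarised in the following sketch. It is illustrative only: every callable passed in (`auto_match`, `ask_user_points`, `affine_match`, `solve_cameras`, `is_accurate`, `constrained_match`) is a hypothetical placeholder for the corresponding routine described in the text, and the way matched points are pooled into one list is an assumption.

```python
def process_triple(auto_match, ask_user_points, affine_match,
                   solve_cameras, is_accurate, constrained_match,
                   points_per_request=10):
    """Sketch of Figure 7 for a single triple of images.

    All callables are hypothetical stand-ins for the routines in the text.
    """
    auto_matches = list(auto_match())                 # steps S52 / S54
    all_matches = list(auto_matches)
    cameras = solve_cameras(all_matches)              # step S56
    if not is_accurate(cameras):                      # step S58
        user_points = list(ask_user_points(points_per_request))  # step S60
        while True:
            # Steps S62 / S64: affine matching seeded by the user's points.
            affine_matches = list(affine_match(user_points))
            all_matches = auto_matches + user_points + affine_matches
            cameras = solve_cameras(all_matches)      # step S66
            if is_accurate(cameras):                  # step S68
                break
            # Steps S70 / S72: ask for more points and try again.
            user_points += list(ask_user_points(points_per_request))
    return constrained_match(cameras, all_matches)    # step S74
```

As in the text, the loop has no upper bound on the number of requests; a practical implementation might choose to add one.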

[0115] At step S76, CPU 4 determines whether there is another image in the positional sequence to be processed.
If there is, steps S54 to S76 are repeated until all images have been processed. It will be seen from Figure 7 that step
S52 is not performed when subsequent images are considered. Referring to the example illustrated in Figure 2 and
Figure 5, there are five images of object 24 to be processed by CPU 4. Points in images 1 and 2 of the positional
sequence are matched at step S52 (and step S62 if the second feature matching technique is used). Points in images
2 and 3 are matched at step S54 (and step S64 if the second feature matching technique is used). As explained
previously, images are considered in triples. Accordingly, when image 4 is considered for the first time, it is considered
in the triple comprising images 2, 3 and 4. However, points in images 2 and 3 will have been matched previously by
CPU 4 at step S54 (and step S64). Step S52 is therefore omitted, and processing begins at step S54 in which automatic
feature matching of points in the second pair of images in the triple (that is, images 3 and 4) is performed. If the
automatic technique fails to generate sufficiently accurate camera transformations at steps S56 and S58, then the
affine technique is performed for both the first pair of images and the second pair of images in the triple. That is, initial
feature matching is re-performed for the first pair of images since the user will identify further matching points in these
images at step S60.

[0116] In this embodiment, constrained feature matching is performed for a given triple of images before the next
image in the sequence is considered and initial feature matching is performed on it. As described previously, the step
of constrained feature matching produces further matching points in the triple of images being considered. In fact, as
will be described below, points are identified in the final image of the triple which match points which have been
previously matched in the first pair of images (thus giving points which are matched in all three images). The present
embodiment provides the advantage that these newly matched points in the final image of the triple are used when
performing initial feature matching on the next image in the triple. For example, when the first three images of the
sequence shown in Figure 5 are processed, the step of constrained feature matching at step S74 identifies points in
image 3 which match points in images 1 and 2. When CPU 4 considers image 4 and performs initial feature matching
at step S54 (and step S64), the new points generated at step S74 are considered and processing is performed to
determine whether a matching point exists in image 4. If a matching point is identified in image 4, the new points
matched by constrained feature matching at step S74 and the new point identified in image 4 by initial feature matching
form a triple of points and are taken into consideration when calculating the camera transformations at step S56 or
S66. Thus, the step of constrained feature matching at step S74 may generate points which are used when calculating
the camera transformations for the next triple of images (that is, if the initial feature matching at step S54 or S64 for
the second pair of images in the next triple matches at least one of the points matched across the first pair of images
in constrained feature matching into the third image of the new triple). This will be described in greater detail later.

[0117] Thus, the procedure shown in Figure 7 generates a flow of new matched points determined using the
calculated camera transformations for input to subsequent initial feature matching operations, and possibly also to
subsequent calculating camera transformation operations.

[0118] The operations performed by CPU 4 for automatic initial feature matching at steps S52 and S54 in Figure 7
will now be described.

[0119] Figure 8 shows the operations performed by CPU 4 at step S52 when performing automatic initial feature
matching for the first pair of images in the triple.

[0120] At step S80, a value is calculated for each pixel in the first image of the triple indicating the amount of "edge"
and "corner" for that pixel. This is done, for example, by applying a conventional pixel mask to the first image, and
moving this so that each pixel is considered. Such a technique is described in "Computer and Robot Vision Volume
1", by R.M. Haralick and L.G. Shapiro, Section 8, Addison-Wesley Publishing Company, 1992, ISBN 0-201-10877-1
(V.1). At step S82, any pixel which has "edge" and "corner" values exceeding predetermined thresholds is identified
as a strong corner in the first image, in a conventional manner. At step S84, CPU 4 performs for the second image
the operation previously carried out at step S80 for the first image, and likewise identifies strong corners in the second
image at step S86 using the same technique previously performed at step S82.

[0121] At step S88, CPU 4 compares each strong corner identified in the first image at step S82 with every strong
corner identified in the second image at step S86 which lies within a given area centred on the pixel in the second
image which has the same pixel coordinates as the corner point under consideration in the first image, to produce a
similarity measure for the corners in the first and second images. In this embodiment, the size of the area considered
in the second image is ±10 pixels of the centre pixel in the y-direction and ±200 pixels of the centre pixel in the
x-direction. The use of such a "window" area to restrict the search area in the second image ensures that similar points
which lie on different parts of object 24 are not identified as matches. The window in this embodiment is set to have a
small "y" value (height) and a relatively large "x" value (width) since it has been found that the images of object 24 are
often recorded by a user with camera 12 at approximately the same vertical height (so that a point on the surface of
object 24 is not displaced significantly in the vertical (y) direction in the images) but displaced around object 24 in a
horizontal direction. In this embodiment, the comparison of points is carried out using an adaptive least squares
correlation technique, for example as described in "Adaptive Least Squares Correlation: A Powerful Image Matching
Technique" by A.W. Gruen in Photogrammetry Remote Sensing and Cartography, 1985, pages 175-187.
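
The window restriction of step S88 can be illustrated with a short sketch. It assumes the window is symmetric about the centre pixel (10 pixels either way in y and 200 pixels either way in x) with inclusive bounds; the function name `candidates_in_window` is hypothetical, and the similarity measure itself (adaptive least squares correlation) is not reproduced here.

```python
def candidates_in_window(corner, corners2, half_width=200, half_height=10):
    """Return the strong corners of the second image that fall inside the
    search window centred on the coordinates of `corner` from the first image.

    Assumption: the window is symmetric and its bounds are inclusive.
    """
    x, y = corner
    return [(cx, cy) for (cx, cy) in corners2
            if abs(cx - x) <= half_width and abs(cy - y) <= half_height]
```

Only the corners returned by such a filter would then be compared with the corner in the first image; all other pairs keep a similarity of zero.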
[0122] At step S90, CPU 4 identifies and stores matching points. This is performed using a "relaxation" technique,
as will now be described. Step S88 produces a similarity measure between each strong corner in the first image and
a plurality of strong corners in the second image (that is, those lying within the window in the second image described
above). At step S90, CPU 4 effectively arranges these values in a table array, for example listing all of the strong
corners in the first image in a column, all of the strong corners in the second image in a row, and the similarity measure
for each given pair of corners at the appropriate intersection in the table. In this way, rows of the table array define the
similarity measure between a given corner point in the first image and each corner point in the second image (the
similarity measure may be zero if the corner in the first image was not compared with the corner in the second image
at step S88). Similarly, the columns in the array define the similarity measure between a given corner point in the
second image and each corner point in the first image (again, some values may be zero if the points were not compared
at step S88). CPU 4 then considers the first row of values, selects the highest similarity measure value in the row, and
determines whether this value is also the highest value in the column in which the value lies. If the value is the highest
in the row and column, this indicates that the corner point in the second image is the best matching point for the point
in the first image and vice versa. In this case, CPU 4 sets all of the values in the row and the column to zero (so that
these values are not considered in further processing), and determines whether the highest similarity measure is above
a predetermined threshold (in this embodiment, 0.1). If the similarity measure is above the threshold, CPU 4 stores
the point in the first image and the corresponding point in the second image as matched points. If the similarity measure
is not above the predetermined threshold, then it is determined that, even though the points are the best matching
points for each other, the degree of similarity is not sufficient to store the points as matching points.

[0123] CPU 4 then repeats this processing for each row of the table array until all of the rows have been considered.
If it is determined that the highest similarity measure in a row is not also the highest for the column in which it lies, CPU
4 moves on to consider the next row. Thus, it is possible that no pairs of matching points are identified in step S90.

[0124] CPU 4 reconsiders each row in the table array to repeat the processing above if matching points were identified
the previous time all the rows were considered. CPU 4 continues to perform such iterations until no matching points
are identified in an iteration.
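
The "relaxation" procedure of step S90 amounts to repeatedly picking mutually best pairs from the similarity table. The following is a minimal sketch under stated assumptions: the table is a list of lists with one row per corner in the first image, rows whose best value is zero are skipped, and the function name `relaxation_match` is hypothetical.

```python
def relaxation_match(sim, threshold=0.1):
    """Return (row, column) index pairs that are mutual best matches.

    sim[i][j] is the similarity between corner i of the first image and
    corner j of the second image (zero if the pair was never compared).
    """
    table = [row[:] for row in sim]          # work on a copy
    n_rows = len(table)
    n_cols = len(table[0]) if table else 0
    matches = []
    if n_cols == 0:
        return matches
    found = True
    while found:                             # repeat passes while matches are found
        found = False
        for i in range(n_rows):
            row = table[i]
            j = max(range(n_cols), key=lambda c: row[c])
            best = row[j]
            if best <= 0:
                continue                     # nothing left to consider in this row
            if any(table[r][j] > best for r in range(n_rows)):
                continue                     # not the best in its column
            # Mutual best: remove the row and column from further processing.
            for c in range(n_cols):
                table[i][c] = 0.0
            for r in range(n_rows):
                table[r][j] = 0.0
            if best > threshold:
                matches.append((i, j))
                found = True
    return matches
```

A pair that is mutually best but at or below the threshold is zeroed without being stored, as described in paragraph [0122].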

[0125] Figure 9 shows the steps performed by CPU 4 at step S54 in Figure 7 when performing automatic initial
feature matching for the second pair of images in a triple. In this case, points in the first image of the pair have already
been identified: strong corners in steps S84 and S86 of Figure 8 when the previous pair of images was considered;
and other feature points from automatic initial feature matching (step S54), affine initial feature matching (steps S60,
S64 and S72) and constrained feature matching (step S74) if these steps have been performed for the previous triple
of images. Accordingly, CPU 4 needs only to identify strong corners in the second image of the pair (the third image
of the triple under consideration).

[0126] Referring to Figure 9, at step S92, CPU 4 applies a pixel mask to the third image of the triple and calculates
a value for each pixel in the third image indicating the amount of edge and corner for that pixel. This is performed in
the same way as the operation in step S80 described previously. In step S94, CPU 4 identifies and stores strong corners
in the third image. This is performed in the same way as step S82 described previously. At step S96, CPU 4 considers
the strong points previously identified and stored at step S86, S54, S60, S64, S72 and S74 for the second image in
the triple and the strong corners identified and stored at step S94 for the third image in the triple and calculates a
similarity measure between pairs of points. This is carried out in the same way as step S88 described previously,
again using a window to restrict the points in the third image which are compared against each point in the second image.
At step S98, matching points in the second and third images of the triple are identified and stored. This is carried out
in the same way as step S90 described previously.

[0127] It has been found that the feature matching technique performed by CPU 4 at steps S52 and S54 (described
above) may not accurately generate matched points if the object 24 contains a plurality of feature points which look
similar, that is, if a number of points having the same visual characteristics are distributed over the surface of object
24. This is because, in this situation, points may have been matched in images which, although they have the same
visual characteristics, do not actually represent the same physical point on the surface of object 24. To take account
of this, in this embodiment, a second initial feature matching technique is performed by CPU 4 which divides an image
into small regions using a small number of points which are known to be accurately matched across images, and
tries to match points in corresponding small regions within each image. This second technique assumes that the small
regions created are flat (rather than exhibiting perspective qualities), so that an "affine" transformation between the
corresponding regions can be calculated. The second technique is therefore referred to as an "affine" technique.

[0128] Figures 10a and 10b illustrate the difference between an object exhibiting perspective properties (Figure 10a)
and an object exhibiting affine properties (Figure 10b). (The other type of image that could be input [...] for
processing by CPU 4 is an image of a flat object [...] since all the points on the [...]

[0130] The way in which CPU 4 performs affine initial feature matching for the first pair of images in the triple at step
S62 and the second pair of images in the triple at step S64 in Figure 7 will now be described.

[0131] Figure 11 shows, at a top level, the operations performed by CPU 4 when carrying out affine initial feature
matching across a pair of images in a triple at step S62 or S64 in Figure 7.

[0132] Referring to Figure 11, at step S100, CPU 4 considers the points in each image of a pair which have been
identified by the user at step S60 or S72, and processes the image data to determine [...]
identify matching points in the images (points calculated by CPU 4, e.g. at step S52, S54, S62, S64 or S74 may not
be accurate, and are therefore not used in step S100 in this embodiment).

[0133] Figure 12 shows the way in which step S100 is performed by CPU 4. Referring to Figure 12, at step S106,
CPU 4 calculates the non-binary strength of any edge lying between the identified points in the first image of the pair
(that is, points which were previously identified by the user as corresponding to points in the second image of the pair)
and at step S108, CPU 4 performs the same calculation for the identified points in the second image of the pair (that
is, points which were previously identified by the user as corresponding to points in the first image of the pair).
[0134] Figures 13 and 14 show the way in which edge strengths are determined by CPU 4 at steps S106 and S108.
Referring to Figure 13, CPU 4 considers image data in area "A" lying between two user-identified
points 30, 32 in an image. The area A comprises pixels lying within a set number of pixels (in this embodiment two
pixels) on either side of the pixel through which a straight line connecting points 30 and 32 passes, and within end
boundaries which are placed at a distance "a", in this embodiment corresponding to two pixels, from the points 30, 32
as shown in Figure 13. The pixels above and below the line are considered because user-identified points (e.g. points
30, 32) may not have been positioned accurately by the user during identification on the display and therefore the
edge (if any) may not run exactly between the points. If points 30, 32 are positioned within the image such that a line
therebetween is more vertical than horizontal, then two pixels either side of the pixel through which the line passes
are considered, rather than two pixels above and below the line. The end boundaries are set because it has been
found that points in an image matched by a user at step S60 or step S72 in Figure 7 with points in another image tend
to be points which lie at the end of edges (that is, corners). Pixels close to these points distort the orientation calculations
which are used to identify edges if the points do indeed lie at the end of edges. This is because the edges become
curved near points 30, 32, giving the individual pixels different orientation values to those in the centre region between
the points. For this reason, pixels within two pixels of the points 30, 32 are omitted from the calculation of
strength/orientation.

[0135] Referring to Figure 14, at step S114, CPU 4 smooths the image data in a conventional manner, for example
as described in chapter 4 of "Scale-Space Theory in Computer Vision" by Tony Lindeberg, Kluwer Academic Publishers,
ISBN 0-7923-9418-6. A smoothing parameter of 1.0 pixels is used in this embodiment (this being the standard deviation
of the mask operator used in the smoothing process).

[0136] At step S115, CPU 4 calculates edge magnitude and direction values for each pixel in the image. This is done
by applying a pixel mask in a conventional manner, for example as described in "Computer and Robot Vision" by
Haralick and Shapiro, Addison-Wesley Publishing Company, pages 337-346, ISBN 0-201-10877-1 (V.1). In this
embodiment, at step S114 the data for the entire image is smoothed and at step S115 edge magnitude and direction values
are calculated for every pixel. However, it is possible to select only relevant areas of the image for processing in each
of these steps instead.

[0137] At step S116, CPU 4 considers the pixels lying within area A between each pair of user-identified points, and
calculates the magnitude of any edge line between those points. Referring again to Figure 13, CPU 4 starts by
considering the first column of pixels in the area A, for example the column of pixels which are left-most in the image.
Within this column, it first considers the top pixel, and compares the edge magnitude and edge direction values
calculated at step S115 for this pixel against thresholds. In this embodiment the magnitude threshold is set at a very low
setting of 0.01 smooth grey levels per pixel. This is because edges often become "weakened" in an image, for example
by the lighting, which can produce shadows etc. across the edge. Accordingly, by using a small magnitude threshold,
it is ensured that all pixels having any reasonable value of edge magnitude are considered. The direction threshold is
set so as to impose a relatively strict requirement for the direction value of the pixel to lie within a small angular deviation
(in this embodiment 0.5 radians) of the direction of the straight line connecting points 30 and 32. This is because
direction has been found to be a much more accurate way of determining whether the pixel actually represents an
edge than the pixel magnitude value.

[0138] If the top pixel in a column of pixels has values above the magnitude threshold and below the direction
threshold, then a "vote" is registered for that column, indicating that part of an edge between the points 30, 32 exists in that
column of pixels. If the values of the top pixel do not meet these criteria, then the same tests are applied to the remaining
pixels in the column, moving down the column. Once a pixel is found satisfying the threshold criteria, a "vote" is
registered for the column and the next column of pixels is considered. On the other hand, if no pixel within the column is
found which satisfies the threshold criteria, then no "vote" is registered for the column. When all of the columns of
pixels have been processed in this manner, CPU 4 determines the percentage of columns which have registered a
"vote", this representing the strength of the edge, and stores this percentage.
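
The column-voting calculation of step S116 can be sketched as follows. The sketch assumes the caller has already extracted the pixels of area A as a list of columns (or of rows, when the line is more vertical than horizontal), and that edge directions are compared modulo π, since the text does not say how the angular deviation is computed. The names `edge_strength` and `angular_deviation` are hypothetical.

```python
import math

def angular_deviation(a, b):
    """Smallest angle between two undirected orientations, in radians.

    Assumption: an edge direction and its opposite are treated as equal.
    """
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)

def edge_strength(magnitudes, directions, line_direction,
                  mag_threshold=0.01, dir_threshold=0.5):
    """Percentage of columns in area A that register a "vote".

    magnitudes[k] and directions[k] hold the per-pixel edge magnitude and
    direction for column k, listed from the top pixel downwards.
    """
    if not magnitudes:
        return 0.0
    votes = 0
    for mag_col, dir_col in zip(magnitudes, directions):
        for mag, direction in zip(mag_col, dir_col):
            if (mag > mag_threshold and
                    angular_deviation(direction, line_direction) < dir_threshold):
                votes += 1       # one vote per column at most
                break
    return 100.0 * votes / len(magnitudes)
```

With four columns of which two contain a qualifying pixel, the stored edge strength would be 50%.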

[0139] Referring again to Figure 12, after performing steps S106 and S108, CPU 4 has calculated and stored a
strength for each edge in each image of the pair.

[0140] At step S110, CPU 4 calculates the combined strength of corresponding edges in the first image of the pair 
and the second image of the pair. This is done, for example, by reading the stored percentage edge strength calculated 
at step S106 for an edge in the first image and the value calculated in step S108 for the corresponding edge in the 
second image and calculating the geometric mean of the percentages (that is, the square root of the product of the
percentages). If the resulting, combined strength value is less than 90%, CPU 4 determines that the edges are not
sufficiently strong to consider further, and discards them. If the combined strength value is 90% or greater, CPU 4 
stores the value and identifies the edges in both images as strong edges for future use. 

[0141] By performing step S110, CPU 4 effectively considers the strength of an edge in both images of a pair to 
determine whether an edge actually exists between given points. In this way an edge may still be identified even if it 
has become distorted (for example, broken) somewhat in one of the images since the strength of the edge in the other
image will compensate. 
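
A short sketch of the combined-strength test of step S110 follows; the function names are hypothetical, and strengths are taken to be percentages in the range 0 to 100.

```python
import math

def combined_strength(strength1, strength2):
    """Geometric mean of the edge strengths (percentages) in the two images."""
    return math.sqrt(strength1 * strength2)

def is_strong_edge(strength1, strength2, threshold=90.0):
    """True if the combined strength is 90% or greater, as in step S110."""
    return combined_strength(strength1, strength2) >= threshold
```

For instance, an edge with strength 100% in one image and 81% in the other has a combined strength of 90% and is retained, which illustrates how a strong edge in one image can compensate for a partly broken edge in the other.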

[0142] At step S112, CPU 4 considers the strong edges in the first image of the pair, that is the edges which remain 
after the weak ones have been removed at step S110, and processes the image data to remove any crossovers between
the edges. 

[0143] Figure 15 shows the operations performed by CPU 4 in determining whether any crossovers occur between
the edges and removing them. Referring to Figure 15, at step S120, CPU 4 produces a list of the edges in the first
image of the pair arranged in combined strength order, with the edge having the highest combined strength at the top
of the list. Since the strength of the edges is calculated and stored as floating point numbers, it is unlikely that two
edges will have the same combined strength. At step S122, CPU 4 considers the next pair of edges in the list (this
being the first pair the first time the step is performed), and at step S124, CPU 4 compares the coordinates of the points
at the ends of each edge to determine whether both end points of the first edge lie on the same side of the second
edge. If it is determined that they do, CPU 4 determines at step S126 that the edges have a relationship corresponding
to the case shown in Figure 16a and that therefore they do not cross. On the other hand, if it is determined at step
S124 that both end points of the first edge do not lie on the same side of the second edge, then the edges have a
relationship corresponding to either that shown in Figure 16b or that shown in Figure 16c. To determine which, at step
S128, CPU 4 again considers the coordinates of the points to determine whether both end points of the second edge
lie on the same side of the first edge. If they do, CPU 4 determines at step S126 that the edges do not cross, the edges
corresponding to the case shown in Figure 16b. If it is determined that both end points of the second edge do not lie
on the same side of the first edge at step S128, then CPU 4 determines that the edges cross, as shown in Figure 16c,
and at step S130 deletes the second edge of the pair, this being the edge with the lower combined strength. This is
done by setting the combined strength of the edge to zero, thereby effectively deleting the edge from both the first and
second images. At step S132, CPU 4 determines whether there is another edge in the list which has not yet been
compared. Steps S122 to S132 are repeated until all edges have been considered in the manner just described.

[0149] Referring again to Figure 11, at step S104, CPU 4 uses the triangles defined from user-identified points in
step S102 to calculate further corresponding points in a pair of images.
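
Returning to the crossover removal of steps S120 to S132 (Figure 15), the same-side tests can be sketched with signed cross products. The sketch makes several assumptions that are not in the text: an end point lying exactly on the other edge's line (for example, a shared end point) is treated as being on the same side, every stronger surviving edge is compared with every weaker one, and deletion is recorded with a flag rather than by zeroing the strength. All function names are hypothetical.

```python
def side(p, a, b):
    """Signed area test: positive and negative values lie on opposite
    sides of the line through a and b; zero means p is on the line."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def edges_cross(e1, e2):
    """True if edges e1 and e2 (pairs of end points) properly cross."""
    (a, b), (c, d) = e1, e2
    if side(a, c, d) * side(b, c, d) >= 0:    # steps S124 / S126
        return False
    if side(c, a, b) * side(d, a, b) >= 0:    # steps S128 / S126
        return False
    return True                               # case of Figure 16c

def remove_crossovers(edges):
    """edges: list of (combined_strength, (p, q)) tuples.
    Returns the surviving edges, strongest first."""
    ordered = sorted(edges, key=lambda e: e[0], reverse=True)   # step S120
    alive = [True] * len(ordered)
    for i in range(len(ordered)):
        if not alive[i]:
            continue
        for j in range(i + 1, len(ordered)):
            if alive[j] and edges_cross(ordered[i][1], ordered[j][1]):
                alive[j] = False              # step S130: drop the weaker edge
    return [e for e, keep in zip(ordered, alive) if keep]
```

Two diagonals of a square cross, so the weaker diagonal is dropped, while an edge that merely shares an end point with a stronger edge survives.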

[0150] Figure 18 shows the operations performed by CPU 4 in step S104. Referring to Figure 18, at step S160, CPU 4
calculates the affine transformation for each triangle, the transformation being of the form:

(1)

where (x,y,1) are the homogeneous coordinates of the point in the first image of the pair, (x',y',1) are the homogeneous
coordinates of the point in the second image of the pair, and A to F define the transformation. CPU 4 assumes that the
transformation is the same for each vertex of a triangle (the triangle is sufficiently small that the portion of the surface
of the object represented in the image lying within a triangle can be assumed to be flat), so that the following equation
can be set up using the three known vertices of the triangle in the first image and the three known corresponding points
in the second image:

[0151] where (x,y,1) are the homogeneous co-ordinates of a triangle vertex in the first image, the co-ordinate numbers
indicating with which vertex the co-ordinates are associated, and (x',y',1) are the homogeneous co-ordinates of the point
in the second image which is matched with the triangle vertex in the first image (again, the co-ordinate numbers
indicating with which vertex the point is matched). This equation is solved in a conventional manner to calculate values
for A to F and hence define the transformation for each triangle.
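
The two equations referred to above do not appear in this text. For orientation only, a conventional way of writing an affine transformation in homogeneous co-ordinates, and the corresponding three-vertex system, is shown below. The placement of A to F within the matrix is an assumption; the original layout may differ.

```latex
% Assumed form of the per-point affine transformation (equation (1)):
\begin{equation}
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
=
\begin{pmatrix} A & B & C \\ D & E & F \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
\end{equation}

% Assumed form of the system set up from the three matched triangle vertices:
\begin{equation}
\begin{pmatrix} x'_1 & x'_2 & x'_3 \\ y'_1 & y'_2 & y'_3 \\ 1 & 1 & 1 \end{pmatrix}
=
\begin{pmatrix} A & B & C \\ D & E & F \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ 1 & 1 & 1 \end{pmatrix}
\end{equation}
```

Under this assumed layout, the six unknowns A to F are determined uniquely by the three vertex correspondences, provided the triangle in the first image is not degenerate.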

[0152] At step S162, CPU 4 divides the first image into a series of grid squares of size 25 pixels by 25 pixels, and
sets a flag for each square to indicate that the square is "empty". Figure 19 illustrates an image divided into grid squares.
At step S164, CPU 4 determines whether there are any points in the first image of the pair under consideration which
have been matched with a point in the preceding image in the sequence but which have not been matched with a point
in the second image of the pair. When the first image of the pair under consideration is the very first image in the
sequence (the image taken at position L1 in the example of Figure 2) then there are no such points since there is no
preceding image in the sequence. When the second image in the sequence (the image taken at position L3 in the
example of Figure 2) is the first image in the pair under consideration, it will be seen from Figure 7 that points may
have been matched with the preceding image (the first image in the sequence) by automatic initial feature matching
at step S52, by user matching at step S60 or step S72 or by affine initial feature matching at step S62. When the first
image of the pair under consideration is the third or a subsequent image in the sequence (one of the images taken at
positions L2, L4 or L5), points may have been matched with the preceding image by automatic initial feature matching
at step S54, by user matching at step S60 or step S72, by affine initial feature matching at step S62 or step S64, or
additionally by constrained feature matching at step S74, as described previously and as described in greater detail
later.

[0153] Referring again to Figure 18, if CPU 4 determines at step S164 that such points exist, at step S166 it considers
one of the points, referred to as a "previously matched" point, and at step S168 determines whether this point lies
within a triangle created at step S102 in Figure 11 in the first image of the pair. If the point does not lie within a triangle,
the processing proceeds to step S178 where CPU 4 determines whether there is another previously matched point in
the first image of the pair. Steps S166, S168 and S178 are repeated until a previously matched point lying within a
triangle in the first image of the pair is identified, or until all such previously matched points have been considered.
When it is determined at step S168 that the previously matched point being considered does lie within a triangle in the
first image of the pair, at step S170, CPU 4 tries to find a corresponding point in the second image of the pair. This is
done by applying the affine transformation for the triangle in which the point lies (previously calculated at step S160)
to the co-ordinates of the point to identify a point in the second image, and then applying an adaptive least squares
correlation routine, such as the one described in the paper "Adaptive Least Squares Correlation: A Powerful Image
Matching Technique" by A.W. Gruen, Photogrammetry Remote Sensing and Cartography, 1985, pages 175-187, to
consider the identified point in the second image and points in a small area around it to determine whether any point
has the same image characteristics as the previously matched point in the first image of the pair. This produces a
similarity measure for a point in the second image. At step S172, CPU 4 determines whether a corresponding point in
the second image of the pair has been found by comparing the similarity measure with a threshold (in this embodiment,
0.4). If the similarity measure is greater than the threshold, it is determined that the point in the second image having
this similarity measure corresponds to the previously matched point in the first image and at step S174, CPU 4 changes
the flag for the grid square in which the point in the first image lies to indicate that the grid square is "full". At step S176,
CPU 4 stores data identifying the points as matched.

[0154] At step S178, CPU 4 considers whether there is another previously matched point in the first image of the 
55 pair not yet considered, and if there is, steps S166to S178 are repeated until all previously matched points in the first 
image of the pair have been processed in the manner just described. 
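
The geometric part of steps S168 to S172 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function names, the 2x3 layout of the affine parameters and the tuple representation of points are assumptions made for the example, and the adaptive least squares correlation itself is not reproduced (its similarity measure is simply taken as an input).

```python
def point_in_triangle(p, a, b, c):
    """Return True if point p lies inside (or on the edge of) triangle a, b, c."""
    def cross(o, u, w):
        # z-component of the cross product (u - o) x (w - o)
        return (u[0] - o[0]) * (w[1] - o[1]) - (u[1] - o[1]) * (w[0] - o[0])

    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    # p is inside when it is on the same side of all three edges
    return not (has_negative and has_positive)


def apply_affine(affine, p):
    """Map p from the first image into the second image.

    affine is assumed to be ((a, b, tx), (c, d, ty)), the transformation
    associated with the triangle containing p.
    """
    (a, b, tx), (c, d, ty) = affine
    return (a * p[0] + b * p[1] + tx, c * p[0] + d * p[1] + ty)


def is_match(similarity, threshold=0.4):
    """Step S172: accept only if the similarity measure exceeds the threshold."""
    return similarity > threshold
```

The point returned by `apply_affine` is only the starting estimate; in the text the correlation routine then examines a small area around it before the threshold test is applied.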

[0155] When all of the previously matched points in the first image of the pair have been processed, or if it is determined at step S164 that there are no previously matched points, then at step S180, CPU 4 considers the next empty [...] is equal to, or above, the threshold (indicating that the point is suitable for [...] by comparing the similarity measure with a threshold. If the similarity measure [...] CPU 4 determines that the point identified in the second image matches the point in the [...] stores the match. If the similarity measure is below the threshold, CPU 4 determines that no matching point has been found in the second image.

[...] of points in the first image of the pair to be considered for matching can [...] changing the size of squares in the grid. If the squares are made smaller, then a [...] of [...]

[...] CPU 4 determines whether the images in the triple, for which the camera transformations are to be calculated, are the first three images in the positional sequence. Referring again to Figure 7, when the first three images [...] the camera transformations [...] first pair of images in the triple have not been calculated previously. However, when the [...] in the sequence is considered, the triple of images being processed comprises the second, third and fourth images in the sequence. In this case, the camera transformations between the second and third images in the [...] (the second and [...] the sequence). Similarly, when subsequent images of the sequence are [...] camera transformations for the [...] pair of images will also have been calculated previously in connection with the previous triple of images.

[...] When the camera transformations for the first pair of images in the triple have been calculated previously, the [...] performed by CPU 4 is simplified by using the previously calculated transformations. Accordingly, CPU 4 performs a different calculation routine depending upon whether the camera transformations for the first pair of images in the triple have already been calculated: a first routine is performed in step S202 when the triple of images being [...] the first three images in the positional sequence, and a second routine is performed at step S204 for other triples of images.

[0163] The calculation routine performed at step S202 for the triple of images comprising the first three images in the positional sequence will be described first.

[0164] Figure 21 shows, at a top level, the operations performed by CPU 4 in performing the calculation routine at step S202 in Figure 20. Referring to Figure 21, at step S206, CPU 4 sets up the parameters necessary for the calculation. At step S208, CPU 4 calculates the camera transformations between the first pair of images in the triple and
stores the results, and at step S210, CPU 4 calculates the camera transformations between the second pair of images 
in the triple and stores the results. At step S212, the camera transformations for the first pair of images calculated at 
step S208 and for the second pair of images calculated at step S210 are used to calculate the camera transformations 
for all three images in the triple, these transformations then being stored. 

[0165] Figure 22 shows the operations performed by CPU 4 in setting up the parameters at step S206. Referring to
Figure 22, at step S214, CPU 4 reads the camera data input by the user at step S30 (Figure 4). At step S216, CPU 4
reads the points matched in the first pair of images of the triple during initial feature matching at steps S52, S60, S62
and S72 (Figure 7) and the points matched in the second pair of images in the triple during initial feature matching at
steps S54, S60, S64 and S72 (Figure 7).

[0166] At step S218, CPU 4 generates, for each pair of images, a list of the matched points which are user-identified
(that is, identified by the user at step S60 or S72 in Figure 7) and a list of matched points comprising both points
calculated by CPU 4 as matching (at steps S52, S54, S62 or S64 in Figure 7) and user-identified points. Some of the
calculated matching points may be the same as user-identified matching points. If this is the case, CPU 4 deletes the
CPU-calculated points from the list so that there are no duplicate pairs of matching points. By deleting the CPU-calculated
points, CPU 4 ensures that a point appears in both of the lists which will be used for the calculations (one of
these lists being user-identified points alone, and hence the point would not appear in this list if user-identified points
were deleted to remove duplicates). The number of points in the list of user-identified matching points may be zero.
This will be the case if affine initial feature matching at steps S60 to S72 in Figure 7 has not been performed.
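
The list handling at step S218 can be illustrated with a short sketch. The function name and the representation of a matched pair as `((x, y), (x', y'))` are assumptions made for this example; the point is only that, where a pair appears in both sources, the CPU-calculated copy is the one dropped.

```python
def build_point_lists(user_pairs, calculated_pairs):
    """Return (user_list, combined_list) for one pair of images.

    Each pair is assumed to be ((x, y), (x_prime, y_prime)).
    Duplicates are resolved by dropping the CPU-calculated copy, so a
    user-identified pair always remains present in both lists.
    """
    user_list = list(user_pairs)
    seen = set(user_list)
    combined = list(user_list)
    for pair in calculated_pairs:
        if pair not in seen:  # skip calculated pairs that duplicate a user pair
            combined.append(pair)
            seen.add(pair)
    return user_list, combined
```

If no user matching has been carried out, `user_list` is simply empty and the combined list contains only calculated pairs.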
[0167] Also at step S218, CPU 4 generates a list of "triple" points, that is, points (including both user-matched points
and CPU-calculated points) which are matched across all three images in the triple of images being considered.
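
One way to picture the "triple" list is as a join of the two pairwise match lists on the shared middle image. The sketch below assumes, purely for illustration, that a point in the middle image is identified by identical co-ordinates in both match lists; the function name and data layout are hypothetical.

```python
def triple_points(first_pair_matches, second_pair_matches):
    """Find points matched across all three images of a triple.

    first_pair_matches:  list of (point_in_image_1, point_in_image_2)
    second_pair_matches: list of (point_in_image_2, point_in_image_3)
    Returns a list of (point_1, point_2, point_3).
    """
    # Index the second list by its image-2 point for direct lookup
    by_middle = {middle: third for (middle, third) in second_pair_matches}
    return [
        (first, middle, by_middle[middle])
        for (first, middle) in first_pair_matches
        if middle in by_middle
    ]
```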
[0168] At step S220, CPU 4 normalises the co-ordinates of the points in the lists created at step S218. Up to this 
point, the co-ordinates of the points are defined in terms of the number of pixels across and down the image from the 
top left-hand corner of the image. At step S220, CPU 4 uses the camera focal length and image plane (film or CCD) 
size read at step S214 to convert the co-ordinates of the points from pixels to a co-ordinate system in millimetres having
an origin at the camera optical centre. The millimetre coordinates are related to the pixel coordinates as follows: 



$$x^{*} = h \times (x - C_x) \tag{3}$$

$$y^{*} = -v \times (y - C_y) \tag{4}$$

where (x*,y*) are the millimetre coordinates, (x,y) are the pixel coordinates, (C_x,C_y) is the centre of the image (in pixels),
which is defined as half of the number of pixels in the horizontal and vertical directions, and "h" and "v" are the horizontal
and vertical distances between adjacent pixels (in mm).
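
As a small worked illustration of equations (3) and (4), the conversion can be written as below. The function name and argument names are assumptions for this sketch; the pixel spacings h and v are taken as given rather than derived from the focal length and image plane size.

```python
def pixel_to_mm(x, y, width_px, height_px, h, v):
    """Convert pixel co-ordinates to millimetre co-ordinates.

    (x, y): pixel co-ordinates measured from the top left-hand corner.
    width_px, height_px: image size in pixels; the centre (C_x, C_y)
        is taken as half of each, as stated in the text.
    h, v: horizontal and vertical distance between adjacent pixels (mm).
    """
    c_x = width_px / 2.0
    c_y = height_px / 2.0
    x_mm = h * (x - c_x)
    y_mm = -v * (y - c_y)  # sign flip: pixel y runs downwards
    return x_mm, y_mm
```

For example, with a 640 x 480 image, h = 0.5 and v = 0.25, the top left-hand pixel (0, 0) maps to (-160.0, 60.0) and the image centre maps to the origin.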

[0169] CPU 4 stores both the millimetre coordinates and the pixel coordinates. 

[0170] At step S222, CPU 4 sets up a measurement matrix, M, as follows for each of the list of user-identified points
and the list of user-identified and calculated points generated at step S218:

$$
M = \begin{pmatrix}
x'_1 x_1 & -y'_1 x_1 & x_1 & -x'_1 y_1 & y'_1 y_1 & -y_1 & x'_1 & -y'_1 & 1 \\
x'_2 x_2 & -y'_2 x_2 & x_2 & -x'_2 y_2 & y'_2 y_2 & -y_2 & x'_2 & -y'_2 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x'_k x_k & -y'_k x_k & x_k & -x'_k y_k & y'_k y_k & -y_k & x'_k & -y'_k & 1
\end{pmatrix} \tag{5}
$$

where (x,y) are the pixel co-ordinates of the point in the first image of the pair, (x',y') are the pixel co-ordinates of the
corresponding (matched) point in the second image of the pair, and the numbers 1 to k indicate to which pair of points
the co-ordinates correspond (there being k pairs of points in total in the list - which may, of course, be different for the
user-identified points list and the user-identified and calculated points list).

[0171] At step S224, CPU 4 determines the number of iterations to be performed for the four different calculation
techniques that it will use to calculate the camera transformations for the first pair of images and the four different
calculation techniques that it will use to calculate the camera transformations for the second pair of images. The four
techniques used to calculate the camera transformations (the same techniques being used for the first pair of images
and the second pair of images) are: a perspective calculation using the list of user-identified points; a perspective
calculation using the list of both user-identified and calculated points; an affine calculation using the list of user-identified
points; and an affine calculation using the list of both user-identified and calculated points.
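
The measurement matrix set up at step S222 can be sketched as one row per matched pair. The row layout used here, the products of (x, -y, 1) with (x', -y', 1), is a reading of the partly legible equation (5) and should be treated as an assumption, as should the function name and the pair representation.

```python
def measurement_matrix(pairs):
    """Build a k x 9 measurement matrix, one row per matched pair.

    pairs: list of ((x, y), (x_prime, y_prime)) pixel co-ordinates.
    Assumed row layout (a reading of equation (5)):
        [x'x, -y'x, x, -x'y, y'y, -y, x', -y', 1]
    """
    rows = []
    for (x, y), (xp, yp) in pairs:
        rows.append([
            xp * x, -yp * x, x,
            -xp * y, yp * y, -y,
            xp, -yp, 1.0,
        ])
    return rows
```

A separate matrix would be built for each list, so the number of rows k can differ between the user-identified list and the combined list.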

[0172] Figure 23 shows the steps performed by CPU 4 at step S224 in Figure 22 to determine the number of iterations used in each calculation. Referring to Figure 23, at step S230, CPU 4 considers one of the lists produced at step S218 and determines whether the number of points in that list is less than four. If it is, then at step S232 CPU 4 sets the number of iterations, "np", to be performed for the perspective calculation using the points in that list to zero and the number of iterations, "na", to be performed for the affine calculation using the points in that list to be zero too. That is, if it is determined at step S230 that the number of points in the list is less than four, the number of iterations is set to zero at step S232 to ensure that neither the perspective calculation nor the affine calculation is performed since there are not enough pairs of matching points.

[0173] If it is determined at step S230 that the number of pairs of points in the list is not less than four, then at step S234, CPU 4 determines whether the number of pairs of points is less than seven. If it is, then at step S236 the number of iterations, "np", for the perspective calculation using the points in the list is set to zero (since again there are not sufficient points to perform the calculation), and the number of iterations, "na", to be used when performing the affine calculation for the points in the list is set to be fifteen. The value "na" is set to 15 because this represents the maximum number of iterations it is possible to perform without repetition using six pairs of points (the highest number less than seven) in the affine calculation.

[0174] If it is determined at step S234 that the number of pairs of points in the list is not less than seven, then at step S238 CPU 4 sets the number of iterations, "np", to be performed for the perspective calculation using the points in the list to be the minimum of 4,000 and the integer part of k(k-1)(k-2)(k-3)(k-4)(k-5)(k-6)/20160, and sets the number of iterations, "na", to be performed for the affine calculation using the points in the list to be the minimum of 800 and the integer part of k(k-1)(k-2)(k-3)/48. As will be seen later, the value k(k-1)(k-2)(k-3)(k-4)(k-5)(k-6)/20160 represents 25% of the maximum number of iterations it is possible to perform without repetition for the perspective calculation and the value k(k-1)(k-2)(k-3)/48 represents 50% of the maximum number of iterations it is possible to perform without repetition for the affine calculation. The values 4,000 and 800 are chosen since they have been determined empirically to produce acceptable results in a reasonable time limit.
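
The decision logic of Figure 23 is compact enough to express directly. The following is a sketch of the rules as stated in the text; the function name is an assumption, and integer division is used to take the "integer part".

```python
def iteration_counts(k):
    """Return (np, na) for a list containing k pairs of matched points.

    np: iterations for the perspective calculation (7 pairs per sample)
    na: iterations for the affine calculation (4 pairs per sample)
    """
    if k < 4:
        return 0, 0    # step S232: too few pairs for either calculation
    if k < 7:
        return 0, 15   # step S236: affine only
    # step S238: capped fractions of the number of distinct samples
    perspective = k * (k - 1) * (k - 2) * (k - 3) * (k - 4) * (k - 5) * (k - 6)
    affine = k * (k - 1) * (k - 2) * (k - 3)
    return min(4000, perspective // 20160), min(800, affine // 48)
```

Taken literally, a list of exactly seven pairs gives np = 0, because the integer part of 5040/20160 = 0.25 is zero; the first list size that yields a non-zero perspective count under this formula is k = 8 (np = 2).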

[0175] The operations described above with respect to Figure 23 are performed for each of the lists set up at step
S218, with the exception of the list of "triple" points, to calculate the number of iterations to be performed in all four
camera transformation calculation techniques for the first pair of images and for the second pair of images.
[0176] Figure 24 shows, at a top level, the operations performed by CPU 4 when calculating the camera transformations
for the first pair of images in the triple at step S208 (Figure 21), and when calculating the camera transformations
for the second pair of images in the triple at step S210 (Figure 21). Referring to Figure 24, at step S240 CPU 4
calculates the camera transformation between the pair of images using a perspective calculation, and stores the results.
At step S242, CPU 4 calculates the camera transformations for the image pair using an affine calculation and stores
the results. That is, CPU 4 calculates the camera transformations for each pair of images using two techniques each
corresponding to a respective one of the two possible types of image that can be input for processing (as noted previously,
for the third type of image, namely images of a flat object, it is not possible to perform processing to generate
a 3D model of the object).

[0177] Figure 25 shows the operations performed by CPU 4 when calculating the camera transformations using a
perspective calculation at step S240 in Figure 24. Referring to Figure 25, CPU 4 first performs the perspective calculation
using the pairs of points in the list of user-identified points (steps S244 to S262) and then using the pairs of points
in the list containing both user-identified points and calculated points (steps S264 to S282). CPU 4 then determines
which list of points produced the most accurate results, and converts these results into calculated camera transformations
for the pair of images (step S284). These processing operations provide the advantage that the transformation
is calculated using a plurality of different sets of points, thereby giving a greater probability that an accurate transformation
will be calculated. The operations will now be described in greater detail.

[0178] Referring to Figure 25, at step S244, CPU 4 reads the value for the number of iterations to be performed for
the perspective calculation using the user-identified points which was set at step S224 (Figure 22) and determines
whether this value is greater than zero. If it is not, then the processing proceeds to step S264, which is the start of the
processing operations for the perspective calculation using the list of both user-identified and calculated points, since
there are not sufficient user-identified points alone on which to perform the perspective calculation.
[0179] On the other hand, if it is determined at step S244 that the number of iterations is greater than zero, at step S246 CPU 4 increments the value of a counter by one (the first time step S246 is performed, CPU 4 setting the counter value to one). At step S248, CPU 4 selects at random seven pairs of points from the list of matched user-identified points set up at step S218 (Figure 22). At step S250, CPU 4 uses the selected seven pairs of points and the measurement matrix set at step S222 to calculate the fundamental matrix, F, representing the geometrical relationship between the images, F being a three by three matrix satisfying the following equation:

$$\begin{pmatrix} x' & y' & 1 \end{pmatrix} F \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = 0 \tag{6}$$

where (x,y,1) are the homogeneous pixel co-ordinates of any of the seven selected points in the first image of the pair,
and (x',y',1) are the corresponding homogeneous pixel co-ordinates in the second image of the pair.
[0180] The fundamental matrix is calculated in a conventional manner, for example using the technique disclosed in
"Robust Detection of Degenerate Configurations Whilst Estimating the Fundamental Matrix" by P.H.S. Torr, A. Zisserman
and S. Maybank, Oxford University Technical Report 2090/96.

[0181] It is possible to select more than seven pairs of matched points at step S248 and to use these to calculate 
the fundamental matrix at step S250. However, seven pairs of points are used in this embodiment, since this has been
shown empirically to produce satisfactory results, and also represents the minimum number of pairs needed to calculate 
the parameters of the fundamental matrix, reducing processing requirements. 

[0182] At step S252, CPU 4 converts the fundamental matrix, F, into a physical fundamental matrix, F_phys, using the
camera data read at step S214 (Figure 22). This is again performed in a conventional manner, for example as described
in "Motion and Structure from Two Perspective Views: Algorithms, Error Analysis and Error Estimation" by J. Weng,
T.S. Huang and N. Ahuja, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 5, May 1989,
pages 451-476, and as summarised below.

[0183] First the essential matrix, E, which satisfies the following equation is calculated:

$$\begin{pmatrix} x^{*\prime} & y^{*\prime} & f \end{pmatrix} E \begin{pmatrix} x^{*} \\ y^{*} \\ f \end{pmatrix} = 0 \tag{7}$$

where (x*, y*, f) are the co-ordinates of any of the selected seven points in the first image in a millimetre co-ordinate
system whose origin is at the centre of the image, the z co-ordinate having been normalised to correspond to the focal
length, f, of the camera, and (x*', y*', f) are the corresponding co-ordinates of the matched point in the second image
of the pair. The fundamental matrix, F, is converted into the essential matrix, E, using the following equations:

$$A = \begin{pmatrix} 1/h & 0 & C_x/f \\ 0 & 1/v & -C_y/f \\ 0 & 0 & 1/f \end{pmatrix} \tag{8}$$

$$M = A^{T} F A \tag{9}$$

[Equation (10), which expresses E in terms of M and tr(M^T M), is only partly legible in the source.]

where the camera parameters "h", "v", "C_x", "C_y" and "f" are as defined previously, the symbol T denotes the matrix
transpose, and the symbol "tr" denotes the matrix trace.
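
The role of the matrix A in equations (8) and (9) can be checked numerically. The sketch below assumes the entries of A that follow from equations (3) and (4) when the pixel vector is taken as (x, -y, 1); that sign convention, the entries C_x/f and 1/f, and all function names are assumptions made for this illustration rather than a statement of the embodiment's code.

```python
def calibration_matrix(h, v, c_x, c_y, f):
    """Matrix A mapping (x*, y*, f) in millimetres to (x, -y, 1) in pixels."""
    return [
        [1.0 / h, 0.0, c_x / f],
        [0.0, 1.0 / v, -c_y / f],
        [0.0, 0.0, 1.0 / f],
    ]


def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, vec):
    return [sum(m[i][j] * vec[j] for j in range(3)) for i in range(3)]


def to_millimetre_frame(F, A):
    """Equation (9): M = A^T F A, the fundamental matrix re-expressed
    for millimetre co-ordinates (before any normalisation)."""
    return matmul(matmul(transpose(A), F), A)
```

With h = 0.5, v = 0.25, an image centre of (320, 240) and f = 2.0, applying A to the millimetre vector (-110.0, 47.5, 2.0) gives (100.0, -50.0, 1.0), which corresponds to pixel (100, 50) under the stated sign convention.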

[0184] The calculated essential matrix, E, is then converted into a physical essential matrix, "E_phys", by finding the closest matrix to E which is decomposable directly into a translation vector (of unit length) and rotation matrix (this closest matrix being E_phys).

[0185] Finally, the physical essential matrix is converted into a physical fundamental matrix, using the equation:

$$F_{phys} = \left(A^{-1}\right)^{T} E_{phys}\, A^{-1} \tag{11}$$

where the symbol "-1" denotes the matrix inverse.

[0186] Each of the physical essential matrix, E_phys, and the physical fundamental matrix, F_phys, is a "physically realisable
matrix", that is, it is directly decomposable into a rotation matrix and translation vector.

[0187] The physical fundamental matrix, F_phys, defines a curved surface in a four-dimensional space, represented
by the coordinates (x, y, x', y') which are known as "concatenated image coordinates". The curved surface is given by
Equation 6 above, which defines a 3D quadric in the 4D space of concatenated image coordinates.
[0188] At step S253, CPU 4 tests the calculated physical fundamental matrix against each pair of points that were used to calculate the fundamental matrix at step S250. This is done by calculating an approximation to the 4D Euclidean distance (in the concatenated image coordinates) of the 4D point representing each pair of points from the surface representing the physical fundamental matrix. This distance is known as the "Sampson distance", and is calculated in a conventional manner, for example as described in "Robust Detection of Degenerate Configurations Whilst Estimating the Fundamental Matrix" by P.H.S. Torr, A. Zisserman and S. Maybank, Oxford University Technical Report 2090/96.

[0189] Figure 26 shows the way in which CPU 4 tests the physical fundamental matrix at step S253. Referring to Figure 26, at step S290, CPU 4 sets a counter to zero. At step S292, CPU 4 calculates the tangent plane of the surface representing the physical fundamental matrix at the four-dimensional point defined by the co-ordinates of the next pair of points in the seven pairs of user-identified points (the two co-ordinates defining each point in the pair being used to define a single point in the four-dimensional space of the concatenated image co-ordinates). Step S292 effectively comprises shifting the surface to touch the point defined by the coordinates of the pair of points, and calculating the tangent plane at that point. This is performed in a conventional manner, for example as described in "Robust Detection of Degenerate Configurations Whilst Estimating the Fundamental Matrix" by P.H.S. Torr, A. Zisserman and S. Maybank, Oxford University Technical Report 2090/96.

[0190] At step S294, CPU 4 calculates the normal to the tangent plane calculated at step S292, and at step S296 
it calculates the distance along the normal from the point in the 4D space defined by the coordinates of the pair of 
matched points to the surface representing the physical fundamental matrix (the "Sampson distance").
[0191] At step S298, the calculated distance is compared with a threshold which, in this embodiment, is set at 2 3 [...]. If the distance is less than the threshold, then the point lies sufficiently close to the surface, and the physical fundamental matrix is considered to accurately represent the movement of the camera from the first image of the pair to the second image of the pair for the particular pair of matched points being considered. Accordingly, if the distance is less than the threshold, at step S300, CPU 4 increments the counter which was initially set to zero at step S290, stores the points, and stores the distance calculated at step S296.
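
The test of Figure 26 can be illustrated with the commonly used closed-form first-order expression for the Sampson distance. This is offered as a sketch under the assumption that the conventional formula corresponds to the tangent-plane construction described in the text; the function names are hypothetical, the distance is returned unsigned, and the threshold is left as a parameter.

```python
import math


def sampson_distance(F, p, p_prime):
    """First-order distance of the 4D point (x, y, x', y') from the
    surface (x', y', 1) F (x, y, 1)^T = 0."""
    v = (p[0], p[1], 1.0)
    vp = (p_prime[0], p_prime[1], 1.0)
    Fv = [sum(F[i][j] * v[j] for j in range(3)) for i in range(3)]      # F x
    Ftvp = [sum(F[i][j] * vp[i] for i in range(3)) for j in range(3)]   # F^T x'
    residual = sum(vp[i] * Fv[i] for i in range(3))                     # x'^T F x
    # Squared gradient of the residual with respect to (x, y, x', y')
    grad_sq = Fv[0] ** 2 + Fv[1] ** 2 + Ftvp[0] ** 2 + Ftvp[1] ** 2
    if grad_sq == 0.0:
        return 0.0 if residual == 0.0 else float("inf")
    return abs(residual) / math.sqrt(grad_sq)


def count_consistent(F, pairs, threshold):
    """Steps S290 to S302: count and keep the pairs within the threshold."""
    counter = 0
    stored = []
    for p, p_prime in pairs:
        distance = sampson_distance(F, p, p_prime)
        if distance < threshold:
            counter += 1
            stored.append((p, p_prime, distance))
    return counter, stored
```

For a pure sideways translation, where the constraint reduces to y' = y, a pair with y = 0 and y' = 2 lies at distance sqrt(2) from the surface, which is the Euclidean distance of the 4D point from the hyperplane y' - y = 0.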

[0192] At step S302, CPU 4 determines whether there is another pair of points in the seven pairs of points used to calculate the fundamental matrix. Steps S292 to S302 are repeated until all such points have been processed [...]

[0193] Referring again to Figure 25, at step S254, CPU 4 determines whether the physical fundamental matrix calculated [...] to justify further processing [...] against all of the user-identified and calculated points. [...] step S254 is performed by determining whether the counter value set at step S300 ([...] number of pairs of points which have a distance less than the threshold [...] step S298, and hence [...] consistent with the physical fundamental matrix) is equal to 7. That is, CPU 4 determines whether the physical fundamental matrix is consistent with all of the points used to calculate the fundamental matrix [...] the processing proceeds to step S256. On the other hand, if the counter value is equal to 7, at step S255 CPU 4 tests the physical fundamental matrix against each pair of points in the list containing both user-identified and calculated points (even though the physical fundamental matrix has been derived using points from the list [...]). [...] CPU 4 sets the counter to 7 to reflect the seven pairs of points already tested at step S253 [...] physical fundamental matrix, [...] the physical fundamental matrix is tested [...] (although the pairs of points previously tested at step S253 are not re-tested), and (iii) CPU 4 calculates the total error for all points stored at step S300, using the following equation:

[Equation (12), which defines the total error, is not legible in the source.]

where e_i is the distance for the i'th pair of matched points between the 4D point represented by their co-ordinates and the surface representing the physical fundamental matrix calculated at step S296, this value being squared so that it is unsigned (thereby ensuring that the side of the surface representing the physical fundamental matrix on which the point lies does not affect the result), p being the total number of points stored at step S300 and e_th being the distance threshold used in the comparison at step S298.

[0194] In step S255, the counter value and stored points at step S300 (Figure 26) and the total error described above 
include the seven pairs of points tested at step S253. 

[0195] The effect of step S255 is to determine whether the physical fundamental matrix calculated at step S252 is
accurate for each pair of user-identified and calculated points, the value of the counter at the end (step S300) indicating
the total number of the points for which the calculated matrix is sufficiently accurate.

[0196] At step S256, CPU 4 determines whether the physical fundamental matrix tested at step S255 is more accurate than any previously calculated using the perspective calculation technique for the user-identified points alone. This is done by comparing the counter value stored at step S300 in Figure 26 for the last-calculated physical fundamental matrix (this value representing the number of points for which the physical fundamental matrix is an accurate camera solution) with the corresponding counter value stored for the most accurate physical fundamental matrix previously calculated. The matrix with the highest number of points (counter value) is taken to be the most accurate. If the number of points is the same for two matrices, the total error for each matrix (calculated as described above) is compared, and the most accurate matrix is taken to be the one with the lowest error. If it is determined at step S256 that the physical fundamental matrix is more accurate than the currently stored one, at step S258 the previous one is discarded, and the new one is stored together with the number of points (counter value) stored at step S300 in Figure 26, the points themselves, and the total error calculated for the matrix.

[0197] At step S260, CPU 4 determines whether the value of the counter incremented at step S246 is less than the value "np" set at step S224 in Figure 22 defining the number of iterations to be performed. If the value is not less than "np", the required number of iterations has been performed, and the processing proceeds to step S264 in order to carry out the perspective calculation for the points in the list comprising both user-identified points and calculated points. Alternatively, if the required number of iterations has not yet been reached (value of the counter is still less than "np" at step S260), at step S262, CPU 4 determines whether the accuracy of the physical fundamental matrix (represented by the counter value and the total error stored at step S258) has increased at all in the last np/2 iterations. If it has, it is worthwhile performing further iterations, and steps S246 to S262 are repeated. If there has not been any change in the accuracy of the physical fundamental matrix in the last np/2 iterations, processing is stopped even though the number of iterations has not yet reached the value "np" set at step S224 in Figure 22. In this way, processing time can be saved in cases where performing the full number of iterations would not produce significantly more accurate results.
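
The control flow of steps S246 to S262 (iterate, keep the best candidate, stop early when nothing has improved for np/2 iterations) can be sketched generically. The callables `propose` and `evaluate` are hypothetical stand-ins for the sampling and testing steps, and the exact way "the last np/2 iterations" is counted is an interpretation.

```python
def best_of_iterations(n_iters, propose, evaluate):
    """Run up to n_iters trials and return (best, iterations_performed).

    propose():            returns a candidate solution
    evaluate(candidate):  returns (count, total_error)
    best is (count, total_error, candidate), or None if n_iters is 0.
    A candidate is better if it has a higher count, or the same count
    and a lower total error (step S256).
    """
    best = None
    last_improved = 0
    window = n_iters / 2  # early-stop window used at step S262
    iteration = 0
    while iteration < n_iters:              # step S260
        iteration += 1                      # step S246
        candidate = propose()
        count, error = evaluate(candidate)
        if (best is None or count > best[0]
                or (count == best[0] and error < best[1])):
            best = (count, error, candidate)    # step S258
            last_improved = iteration
        elif iteration - last_improved >= window:
            break                           # step S262: no recent improvement
    return best, iteration
```

The same loop shape applies to the second list of points; only the source of the random samples and the value of "np" change.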
[0198] As described above with respect to Figure 23, the value of "np" is set based on the number of pairs of points in the list of points from which the seven pairs are selected at random at step S248. Referring to step S238 in Figure 23, the value k(k-1)(k-2)(k-3)(k-4)(k-5)(k-6)/20160 represents 25% of the maximum number of iterations that it would be possible to perform without repetition (this maximum number being the total number of different combinations of seven pairs of points selected from the list). The value np/2 used at step S262 has been determined empirically to produce acceptable results in a reasonable time.
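
The percentages quoted for the iteration limits follow from the binomial coefficients for choosing seven (perspective) or four (affine) pairs from k. The short derivation below is added only as a check of that arithmetic:

$$
\binom{k}{7} = \frac{k(k-1)(k-2)(k-3)(k-4)(k-5)(k-6)}{7!} = \frac{k(k-1)\cdots(k-6)}{5040},
\qquad
\frac{1}{4}\binom{k}{7} = \frac{k(k-1)\cdots(k-6)}{20160}
$$

$$
\binom{k}{4} = \frac{k(k-1)(k-2)(k-3)}{4!} = \frac{k(k-1)(k-2)(k-3)}{24},
\qquad
\frac{1}{2}\binom{k}{4} = \frac{k(k-1)(k-2)(k-3)}{48},
\qquad
\binom{6}{4} = 15
$$

The last value is the figure of fifteen affine iterations used when the list holds fewer than seven pairs.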

[0199] Referring again to Figure 25, at steps S264 to S282, CPU 4 carries out the perspective calculation for the pair of images using pairs of points selected at random from the list comprising both user-identified and calculated points. The steps are the same as those performed at steps S244 to S262, described above, with the exception that the value "np" defining the number of iterations to be performed has been set differently (step S224 in Figure 22), and the seven pairs of points used to calculate the fundamental matrix selected at random are chosen from the list comprising both user-identified and calculated points. The operations performed in this processing will not, therefore, be described again. As before, Figure 26 shows the steps performed when testing the physical fundamental matrix against each pair of user-identified and calculated points (step S273 and step S275).

[0200] At step S284, CPU 4 compares the most accurate physical fundamental matrix calculated using the user-identified points alone (stored at step S258) and the most accurate physical fundamental matrix calculated using both the user-identified points and calculated points (stored at step S278), and selects the most accurate of the two (by comparing the counter values which represent the number of points for which the matrices are an accurate solution, and, if these are the same, the total error). The most accurate physical fundamental matrix is then converted to a camera rotation matrix and translation vector representing the movement of the camera between the pair of images. This conversion is performed in a conventional manner, for example as described in the above-referenced "Motion and Structure from Two Perspective Views: Algorithms, Error Analysis and Error Estimation" by J. Weng, T.S. Huang and N. Ahuja, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 5, May 1989, pages 451-476.

[0201] In the processing described above with respect to Figure 25, CPU 4 calculates a fundamental matrix (steps S250 and S270), and converts this to a physical fundamental matrix (steps S252 and S272) for testing against the user-identified points and calculated points (steps S255 and S275). This has the advantage that, although additional processing is required to convert the fundamental matrix to a physical fundamental matrix, the physical fundamental matrix ultimately selected at step S284 has itself been tested. If the fundamental matrix was tested against the user-identified and calculated points, and the most accurate fundamental matrix selected, this would then have to be converted to a physical fundamental matrix which would not, itself, have been tested.

[...] Referring again to Figure 24, CPU 4 has now completed the perspective calculations for the image pair and proceeds to step S242, in which it performs the second type of calculation, namely an affine calculation, for the image [...]

[...] Figure 27 shows the operations performed by CPU 4 when carrying out the affine calculations. [...] performing the perspective calculations, CPU 4 performs an affine calculation using pairs of points selected from the list of user-identified points alone (steps S310 to S327), and using pairs of points from the list of points comprising both user-identified points and calculated points (steps S328 to S345), and then selects the most accurate [...]. Again, this provides the advantage that the transformation is calculated using a plurality of different sets of points, thereby giving a greater probability that an accurate transformation will be calculated.

[...] performing the perspective calculations, it is possible to calculate all of the components of the fundamental matrix, F. However, when the relationship between the pair of images is an affine relationship, it is possible to calculate only four independent components of the fundamental matrix, these four independent components defining what is commonly known as an "affine" fundamental matrix.

Referring to Figure 27, at step S310, CPU 4 determines whether the number of iterations, "na", set at step
S224 (Figure 22) for affine calculations using user-identified points alone is greater than zero. If it is not, no affine
calculation is carried out using the user-identified points alone, and processing proceeds to step S328, where the list
of points comprising both user-identified points and calculated points is considered.

On the other hand, if the number of iterations to be performed is greater than zero, at step S312 CPU 4 increments
the value of a counter (the value of the counter being set to one the first time step S312 is performed).

At step S314, CPU 4 selects at random four pairs of matched points from the list of points containing
user-identified points alone.

At step S316, CPU 4 uses the selected four pairs of points and the measurement matrix set at step S222 to
calculate four independent components of the fundamental matrix (giving the "affine" fundamental matrix) using a
technique such as that described in "Affine Analysis of Image Sequences" by L.S. Shapiro, Cambridge University
Press, 1995, ISBN 0-521-55063-7. It is possible to select more than four pairs of points at step S314
and to use these to calculate the affine fundamental matrix at step S316. However, in the present embodiment, only
four pairs are selected since this has been shown empirically to produce satisfactory results, and also represents the
minimum number required to calculate the components of the affine fundamental matrix, reducing processing
requirements.

CPU 4 then determines whether the affine fundamental matrix calculated at step S316 and tested at step S318
is more accurate than any previously calculated using the user-identified points alone. This is done by comparing
the number of points for which each matrix is an accurate solution and, if these are the same, the total error, the
more accurate matrix being stored at step S322. At step S324, CPU 4 determines whether the value of the counter
incremented at step S312 is less than the set number of iterations. If it is not less than the set number
of iterations, then the required number of iterations have been performed, and processing proceeds to step S328. If
the value of the counter is less than the set number of iterations, CPU 4 performs a further test at step S326 to determine 
whether the accuracy of the affine fundamental matrix has increased at all in the last na/2 iterations. If the accuracy 
has not increased, then processing is stopped even though the set number of iterations, "na", has not yet been
performed. In this way, iterations which would not produce any increase in the accuracy of the affine fundamental matrix
are not performed, and hence processing time is saved. On the other hand, if the accuracy has increased, steps S312 
to S326 are repeated until either it is determined at step S324 that the set number of iterations has been performed 
or it is determined at step S326 that there has been no increase in accuracy of the affine fundamental matrix in the 
previous na/2 iterations. 
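
The iteration scheme of steps S312 to S326 can be sketched as follows. This is only an illustrative sketch, not the embodiment's implementation: the names `sample_and_test`, `is_better`, `fit` and `score` are hypothetical, `fit` stands in for the affine fundamental matrix calculation of step S316, and `score` is assumed to return the pair (number of points, total error) produced by the test of step S318.

```python
import random


def is_better(candidate, best):
    """Compare (number_of_points, total_error) pairs.

    More points wins; with equal points, the smaller total error wins.
    """
    if best is None:
        return True
    c_points, c_error = candidate
    b_points, b_error = best
    return c_points > b_points or (c_points == b_points and c_error < b_error)


def sample_and_test(pairs, fit, score, na, sample_size=4, rng=None):
    """Random-sampling loop with early stopping.

    Stops after `na` iterations, or earlier if the best solution has not
    improved during the last na/2 iterations.
    """
    rng = rng or random.Random()
    if na <= 0 or len(pairs) < sample_size:
        return None, None          # nothing to do for this list of points
    best_model, best_score = None, None
    last_improved = 0
    for count in range(1, na + 1):                 # counter starts at one
        sample = rng.sample(pairs, sample_size)    # four pairs at random
        model = fit(sample)                        # hypothetical solver
        result = score(model, pairs)               # hypothetical test
        if is_better(result, best_score):
            best_model, best_score = model, result
            last_improved = count
        elif count - last_improved >= na / 2:
            break                                  # no gain in na/2 iterations
    return best_model, best_score
```

The same loop would serve for both lists of points, since only `pairs` and `na` differ between steps S310 to S327 and steps S328 to S345.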

[0211] At step S327, CPU 4 converts the stored affine fundamental matrix (that is, the most accurate calculated using
the user-identified points alone) into three physical variables describing the camera transformation, namely the
magnification, "m", of the object between the two images, the axis, φ, of rotation of the camera, and the cyclotorsion
rotation, θ, of the camera. (The variables φ and θ will be described in greater detail later.) The conversion of the
affine fundamental matrix into these physical variables is performed in a conventional manner, for example as described
in "Affine Analysis of Image Sequences" by L.S. Shapiro, Section 7, Cambridge University Press, 1995, ISBN 0-521-55063-7.

[0212] In steps S328 to S345, CPU 4 carries out the affine calculation using pairs of points selected at random from
the list containing both user-identified points and calculated points. The steps are the same as those performed by
CPU 4 for user-identified points alone in steps S310 to S327 described above, with the exception that the number of
iterations, "na", may have been set to a different value at step S224 in Figure 22, and the four pairs of points selected
at random at step S332 are selected from the list comprising both user-identified and calculated points. These steps
will therefore not be described again.

[0213] Having performed the affine calculation using pairs of points from the list containing user-identified points
alone (steps S310 to S327) and using pairs of points from the list comprising both user-identified and calculated points
(steps S328 to S345), each calculation producing its own most accurate affine fundamental matrix, at step S346 CPU 4
compares these two affine fundamental matrices and selects the more accurate, this being the one
having the highest number of points (stored at steps S322 and S340), and if the number of points is the same, the one
having the lowest matrix total error.

[0214] Referring again to Figure 21, having calculated at step S208 the camera transformation for the first pair of
images in the triple using the perspective and affine techniques described above, and having calculated at step S210
the camera transformation for the second pair of images in the triple using the same perspective and affine techniques,
at step S212 CPU 4 uses the results to calculate the camera transformations for all three images in the triple together.

[0215] Figure 28 shows the operations performed by CPU 4 in calculating the camera transformations for all three
images in the triple together at step S212.

[0216] When considering all three images in the triple, there are two camera transformations - one from the position
at which the first image in the triple was taken to the position at which the second image was taken, and one from the
position at which the second image was taken to the position at which the third image in the triple was taken. Each of
these transformations can be either an affine transformation or a perspective transformation, giving four possible
combinations between the images (namely affine-affine, affine-perspective, perspective-affine and
perspective-perspective). Accordingly, at steps S350, S352, S354 and S356, CPU 4 considers a respective one of the four
possible combinations, and at step S358 selects the most accurate solution from the four. This processing will now be
described in greater detail.

[0217] At step S350, CPU 4 considers the case in which the transformation between the first pair of images in the
triple is affine, and the transformation between the second pair of images is also affine. Previously, at step S208 (Figure
21) CPU 4 has already calculated the affine fundamental matrix and associated three physical variables defining the
affine transformation between the first pair of images in the triple. Similarly, at step S210 (Figure 21) CPU 4 has
calculated the affine fundamental matrix and associated three physical variables defining the affine transformation
between the second pair of images in the triple. As noted previously, the three physical variables derived from an affine
fundamental matrix do not fully define the movement of the camera between a pair of images. At step S350, CPU 4 uses the
previously calculated three physical variables to calculate the parameters necessary to define fully the camera
movement between each pair of images.

[0218] Figures 29a and 29b illustrate the parameters which it is necessary to calculate at step S350 to define fully
the camera movements. Figure 29a shows a CCD imaging device, or film, 50 on which the images are formed in three
different locations and orientations, representing the locations and orientations at which the first, second and third
images in a triple were taken. Lines 52 represent the optical axis of the camera 12. The optical axis 52 moves a distance
d1 in moving from the first position to the second position, and a distance d2 in moving from the second position to the
third position.

[0219] The rotation of CCD 50 between the imaging positions is decomposed into a rotation about the optical axis
52 and a rotation about an axis parallel to the image plane. This is known as the "KvD decomposition" and is described
in "Affine Analysis of Image Sequences" by L.S. Shapiro, Appendix D, Cambridge University Press, 1995, ISBN
0-521-55063-7. The rotation about the optical axis is known as the "cyclotorsion angle" and is represented by "θ" in
Figure 29a. In the example shown in Figure 29a, CCD 50 rotates by an angle θ1=90° from a "landscape" orientation.

[0220] The rotation about the axis parallel to the image plane is decomposed in an axis-angle formulation into two
angles φ and ρ, as shown in Figure 29b. φ defines the axis 54 within the image plane about which rotation occurs,
and ρ is the angle of rotation about that axis.

[0221] This decomposition of the camera rotation into three angles is applied to the transformation of the camera
between the first and second images in each triple (these angles being referred to as θ1, φ1, ρ1) and between the
second and third images (these angles being referred to as θ2, φ2, ρ2).

[0222] The values of ρ1 and ρ2 remain undefined by the affine fundamental matrices calculated at
steps S208 and S210 (Figure 21) and must be calculated at step S350.

[0223] When the transformation between a pair of images is a perspective transformation, the values of ρ,
d, θ, φ are already defined in the rotation matrix and translation vector calculated at step S208 or S210 (Figure 21),
but the scale, s, is not known. Accordingly, at step S352, when CPU 4 considers the affine-perspective case, it is
necessary to calculate the scale, s, and ρ1. At step S354, when CPU 4 considers the perspective-affine case, it is
necessary to calculate the scale, s, and ρ2. At step S356, when CPU 4 considers the perspective-perspective case,
it is necessary to calculate only the scale, s.

[0224] Figure 30 shows the operations performed by CPU 4 at each of steps S350, S352, S354 and S356.

[0225] Referring to Figure 30, at step S380, CPU 4 takes the next value of ρ1, ρ2. Figures 31a-31d show the values
of ρ1, ρ2 considered by CPU 4 in the different cases at steps S350 to S356.

[0226] Figure 31a shows the values of ρ1, ρ2 for the affine-affine case considered at step S350, where both ρ1 and
ρ2 are unknown. The values of ρ1, ρ2 considered comprise eight values of ρ1 varying between 10° and
45° in steps of 5°, each with eight values of ρ2 varying between 10° and 45° in steps of 5°. Values of ρ1 and ρ2 in
this range are considered since it has been found that a user is most likely to move camera 12 in this range between
successive images when images of object 24 are taken. A wider (or narrower) range of values can, of
course, be considered.

[0227] Figure 31b shows the values of ρ1, ρ2 for the affine-perspective case considered at step S352. In this case,
since the second transformation is perspective, the value of ρ2 is known, and therefore different values of only
ρ1 are considered, namely eight values of ρ1 for the known value of ρ2, varying between 10° and
45° in steps of 5°.

[0228] Figure 31c shows the values of ρ1, ρ2 considered for the perspective-affine case considered at step S354.
In this case, since the first transformation is perspective, the value of ρ1 is known, and therefore eight values of ρ2 are
considered for the known value of ρ1, varying between 10° and 45° in steps of 5°.

[0229] Figure 31d shows the values of ρ1, ρ2 considered in the perspective-perspective case in step S356. In this
case, since both transformations are perspective, the values of both ρ1 and ρ2 are known, and hence only this
single pair of values is considered.

[0230] Referring again to Figure 30, at step S382, CPU 4 calculates the scale which best fits the values of ρ1, ρ2
considered at step S380.

[0231] Figure 32 shows the operations performed by CPU 4 when calculating the best scale in step S382. Referring
to Figure 32, at step S390, CPU 4 sets the value of a counter to zero, and at step S392 the value of the counter is
incremented by one. At step S394, CPU 4 reads the co-ordinates of the points in the next triple of matched points, that
is, points matched across all three of the images being considered, from the list produced at step S218 (Figure 22).
At step S396, CPU 4 uses the appropriate camera transformations (affine or perspective) previously calculated
to set up a configuration of the images in the triple, and then to project
a ray from each of the points read at step S394 through the optical centre of the camera (this being
a point perpendicularly displaced from the centre of the image plane by the focal length of the camera).

[0232] Figure 33 illustrates the rays projected from each point in the triple.

[0233] It is unlikely that any of the rays from the points in the triple will intersect, due to inaccuracies in the camera
transformations calculated at step S208 or S210, and inaccuracies in the matched points themselves. Accordingly, at
step S398, CPU 4 calculates the camera transformation between the first and second images which makes the ray
from the second image intersect the ray from the first image at a point 60. This calculation is performed by CPU 4 as
follows:

a) The sign of ρ1 is flipped (reversed) if sin(ρ1) × sin(φ1) > 0. This is done because of prior knowledge of the ordering
of the images.



b) The rotation matrix, R, is defined from the angles (θ1, φ1, ρ1) using the equations:

$$R = \left[ I + M \sin\rho + M^{2} (1 - \cos\rho) \right] R_{\theta} \qquad (13)$$

$$M = \begin{pmatrix} 0 & 0 & \sin\phi \\ 0 & 0 & -\cos\phi \\ -\sin\phi & \cos\phi & 0 \end{pmatrix} \qquad (14)$$

$$R_{\theta} = I + X \sin\theta + X^{2} (1 - \cos\theta) \qquad (15)$$

$$X = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad (16)$$

where I is the identity matrix.

c) The translation vector, t, is defined from the point position in the two images, the rotation matrix, R, and the
change in magnification between the two images, "m", using equations (17) to (21).

[Equations (17) to (21) are not legible in this copy. The only recoverable fragment is the expression
(h(x - c_x)/f, v(y - c_y)/f)'.]
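
The rotation defined in step b) can be illustrated with a short sketch. It assumes that equations (13) to (16) take the Rodrigues form, that is, R = [I + M sin ρ + M²(1 − cos ρ)] R_θ with R_θ = I + X sin θ + X²(1 − cos θ), where M is the skew-symmetric matrix of the in-plane axis (cos φ, sin φ, 0) and X is that of the optical axis (0, 0, 1). The function names are hypothetical and angles are in radians.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rodrigues(k, angle):
    """Return I + K sin(angle) + K^2 (1 - cos(angle)) for a 3x3 skew matrix K."""
    k2 = mat_mul(k, k)
    s, c = math.sin(angle), 1.0 - math.cos(angle)
    return [[(1.0 if i == j else 0.0) + k[i][j] * s + k2[i][j] * c
             for j in range(3)] for i in range(3)]


def kvd_rotation(theta, phi, rho):
    """Compose cyclotorsion theta with a turn rho about the in-plane axis phi."""
    # X: generator of rotation about the optical (z) axis.
    x = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    # M: generator of rotation about the axis (cos phi, sin phi, 0).
    m = [[0.0, 0.0, math.sin(phi)],
         [0.0, 0.0, -math.cos(phi)],
         [-math.sin(phi), math.cos(phi), 0.0]]
    r_theta = rodrigues(x, theta)
    return mat_mul(rodrigues(m, rho), r_theta)
```

For example, `kvd_rotation(math.pi / 2, 0.0, 0.0)` gives a pure 90° cyclotorsion, corresponding to the landscape-to-rotated example of Figure 29a.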



[0234] Similarly, at step S400, CPU 4 varies the translation of the camera between the second and third images to
make the ray from the third image intersect the ray from the second image at a point 62.

[0235] At step S402, CPU 4 uses the ratio of the distance d62 of the point 62 from the optical centre of the camera
at its position for the second image, to the distance d60 of the point 60 from this optical centre, to adjust the length
d1_initial of the translation vector between the first and second camera positions and the length d2_initial of the
translation vector between the second and third camera positions, as follows:

$$d1_{\mathrm{final}} = d1_{\mathrm{initial}} \left[ \frac{d_{62}}{d_{60}} \right]^{1/2} \qquad (22)$$

$$d2_{\mathrm{final}} = d2_{\mathrm{initial}} \left[ \frac{d_{60}}{d_{62}} \right]^{1/2} \qquad (23)$$

[0236] Referring again to Figure 33, the lengths d1_final and d2_final calculated as above are the lengths of the
translation vectors which cause the rays from the three images to cross at the same point 64. CPU 4 then uses the
resulting values at step S404 to test the scale calculated at step S402 against all triple points in the list produced at
step S218 (Figure 22).

[0238] Figure 34 shows the operations performed by CPU 4 when testing the scale against all triple points. Referring
to Figure 34, at step S420, CPU 4 sets up the camera transformations (defined by the values of ρ1, ρ2 taken at step
S380 or by the transformations determined at step S208 or S210 in Figure 21, depending upon whether an affine-affine,
affine-perspective, perspective-affine or perspective-perspective case is being considered) for all three images to take
into account the scale calculated at step S402 (Figure 32). This is performed in a conventional manner, for example by
fixing the origin of the co-ordinate system at the optical centre of the camera in its second position (image 2), with
the axes aligned with the camera in this position (the z-axis being parallel to the optical axis), giving:

$$\text{Centre of camera for third image} = t_{23} \qquad (25)$$

$$\text{Rotation of camera for third image} = R_{23} \qquad (26)$$

$$\text{Centre of camera for first image} = -R_{12}^{T}\, t_{12} \qquad (27)$$

$$\text{Rotation of camera for first image} = R_{12}^{T} \qquad (28)$$

where t is the translation vector between the images indicated by the subscripts, and is given by Equation 17 above,
and R is the rotation between the images indicated by the subscripts, and is given by Equation 13 above.

[0239] At step S422, CPU 4 sets the value of a variable, P, to zero, and at step S424, reads the next triple of matched
points from the list produced at step S218 (Figure 22). At step S426, CPU 4 projects a ray from the point in the triple
which lies in the first image of the triple through the optical centre of the camera in the first position, and from the point
in the triple which lies in the third image of the triple through the optical centre of the camera in the third position.

[0240] Figure 35 illustrates the projection of the rays at step S426.

[0241] At step S428, CPU 4 calculates the mid-point 68 (Figure 35) along the line of closest approach of the rays
from the first and third images, this line of closest approach being the line which is perpendicular to both the
ray from the first image and the ray from the third image, as shown in Figure 35. At step S430, CPU 4 projects the
mid-point calculated at step S428 into the second image of the triple. That is, CPU 4 connects the mid-point 68 to the
second image by a line which passes through the optical centre of the camera for the second image. This produces
a projected point 70 in the second image (Figure 35).

[0242] At step S432, CPU 4 calculates the distance, e, between the projected point 70 in the second image and
the actual point 72 in the second image from the triple of points read at step S424. At step S434, CPU 4 determines
whether the distance calculated at step S432 is less than a threshold, set at 3 pixels in this embodiment. The closer
together the projected point 70 and the actual point 72 in the second image, the more closely this triple of points
supports this value for the scale calculated at step S402 (Figure 32). Accordingly, if the distance is below the threshold,
the calculated scale is considered to be sufficiently accurate, and at step S436, CPU 4 increments the variable P
representing the number of triple points for which the scale is accurate, notes the points in the triple under consideration
as being accurate for the scale under consideration, and updates the total distance error (that is, the error for all the
points so far for which the distance calculated at step S432 was deemed to be below the threshold at step S434) with
the new distance calculated at step S432. The total error is calculated using the following equation:



where e_i is the distance between the projected point 70 and the actual point 72 in the second image for the i-th triple
of points, this value being squared so that it is unsigned (thereby ensuring that only the magnitude of the distance
between the projected point and the actual point is considered, rather than its direction, too), P being the total number
of points, and e_th being the distance threshold used for the comparison at step S434.

[0243] On the other hand, if it is determined at step S434 that the distance is not below the threshold, step S436 is 
omitted so that the variable P is not incremented. 

[0244] At step S438, CPU 4 determines whether there is another triple of points in the list generated at step S218
(Figure 22). Steps S424 to S438 are repeated until the processing described above has been carried out for all the
triple points in the list. At this point, the value of the variable P then indicates the total number of triple points for which
the calculated scale is sufficiently accurate.
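
The geometric test of steps S426 to S434 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the embodiment's code: rays are given as an origin and a direction in the co-ordinate system of the second camera, the second camera is treated as an ideal pinhole with its optical centre at the origin and its optical axis along z, and the function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def closest_approach_midpoint(p1, d1, p2, d2, eps=1e-12):
    """Mid-point of the common perpendicular between rays p1+s*d1 and p2+t*d2.

    Returns None if the rays are (nearly) parallel.
    """
    w = [a - b for a, b in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    s = (b * e - c * d) / denom      # parameter of closest point on ray 1
    t = (a * e - b * d) / denom      # parameter of closest point on ray 2
    q1 = [p + s * k for p, k in zip(p1, d1)]
    q2 = [p + t * k for p, k in zip(p2, d2)]
    return [(u + v) / 2.0 for u, v in zip(q1, q2)]


def project_pinhole(point, focal_length):
    """Project a 3D point (camera-2 co-ordinates) onto the image plane."""
    x, y, z = point
    if z <= 0:
        return None                  # point is not in front of the camera
    return (focal_length * x / z, focal_length * y / z)


def supports_scale(projected, actual, threshold=3.0):
    """True if the projected and actual points are closer than the threshold."""
    return math.hypot(projected[0] - actual[0], projected[1] - actual[1]) < threshold
```

A caller would count the triples for which `supports_scale` is true to obtain P, and accumulate the distances for the total error.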

[0245] Referring again to Figure 32, after testing the scale at step S404 using the method just described, CPU 4
determines at step S406 whether the calculated scale is more accurate than any currently stored. This is done by
comparing the number of points, P, and the total error stored at step S436 (Figure 34) with the number of points and
total error for the previously stored best scale so far. The most accurate scale is the one with the largest number of
points or, if the number of points is the same, the one with the smallest total error. If the newly calculated scale is more
accurate, then it, the number of points, P, and the total error are stored at step S408 to replace the previous most
accurate scale, number of points, and total error. If it is not, then the previous most accurate scale, number of points,
and total error are retained.

[0246] At step S410, CPU 4 determines whether the value of the counter incremented at step S392 is less than 20.
If it is, at step S412, CPU 4 determines whether there is another triple of points in the list stored at step S218 (Figure
22). Steps S392 to S412 are repeated until twenty triples of points have been used to calculate the scale (determined
at step S410) or until all the triples of points in the list stored at step S218 (Figure 22) have been used to calculate the
scale (determined at step S412) if the number of triple points is less than 20. The value 20 has been found empirically
to produce acceptable results for the scale calculation in a reasonable time.

[0247] Referring again to Figure 30, after calculating at step S382 the best value of the scale for the value of ρ1, ρ2
under consideration, at step S384, CPU 4 determines whether the solution, that is, the values of ρ1, ρ2, s, is more
accurate than the solution currently stored. Thus, CPU 4 tests whether the latest values ρ1, ρ2, s calculated at steps
S380 and S382 have produced more accurate camera transformations than values which were previously calculated
at steps S380 and S382. This is done by comparing the number of points, P, stored for the current most accurate
solution and stored for the latest solution at step S408 (Figure 32) and step S436 (Figure 34). The most accurate
solution is the one with the highest number of points, or the one with the smallest total error if the number of points is
the same. If the new solution is more accurate than the currently stored solution, then at step S386, CPU 4 replaces
the currently stored solution with the new one. On the other hand, if the currently stored solution is more accurate, it
is retained.

[0248] At step S388, CPU 4 determines whether there is a further value of ρ1, ρ2 to consider, and steps S380 to
S388 are repeated until all values of ρ1, ρ2 have been processed as described above. Referring to Figure 31 again,
it will be seen from Figure 31a that steps S380 to S388 will be performed sixty four times for the affine-affine case
calculation at step S350 (Figure 28). It will also be appreciated from Figure 31b and Figure 31c that steps S380 to
S388 will be performed eight times for the affine-perspective case calculation at step S352 (Figure 28) and eight times
for the perspective-affine case calculation at step S354 (Figure 28). Steps S380 to S388 will be performed only once
for the perspective-perspective case calculation at step S356 (Figure 28) since, as shown in Figure 31d, only one value
of ρ1, ρ2 is available for consideration at step S380.
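
The iteration counts quoted above follow directly from the candidate grid of Figures 31a to 31d, as the following sketch shows. The function name `rho_candidates` and its parameters are hypothetical; a known angle (perspective case) is passed in as a single value, and an unknown angle (affine case) is swept from 10° to 45° in steps of 5°.

```python
def rho_candidates(rho1_known=None, rho2_known=None, start=10, stop=45, step=5):
    """List the (rho1, rho2) pairs, in degrees, to be tried at step S380."""
    grid = list(range(start, stop + 1, step))      # 10, 15, ..., 45
    rho1_values = grid if rho1_known is None else [rho1_known]
    rho2_values = grid if rho2_known is None else [rho2_known]
    return [(r1, r2) for r1 in rho1_values for r2 in rho2_values]


# affine-affine: both angles unknown
print(len(rho_candidates()))                                   # 64
# affine-perspective: rho2 known
print(len(rho_candidates(rho2_known=20.0)))                    # 8
# perspective-affine: rho1 known
print(len(rho_candidates(rho1_known=20.0)))                    # 8
# perspective-perspective: both known
print(len(rho_candidates(rho1_known=12.5, rho2_known=20.0)))   # 1
```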

[0249] Referring again to Figure 28, having calculated respective solutions for the camera transformations for the
affine-affine case at step S350, for the affine-perspective case at step S352, for the perspective-affine case at step
S354, and for the perspective-perspective case at step S356, at step S358 CPU 4 selects the most accurate of these
four solutions. This is again done by considering the total number of points, P, stored for each solution (step S386 in
Figure 30), that is, the number of points for which the solution is accurate. If solutions have the
same number of points, then the total error for each solution is considered, and the solution with the smallest error is
selected as the most accurate.

[0250] At step S360, CPU 4 determines whether the calculated camera transformations are sufficiently accurate. If
the number of points, P, is less than four, then at step S362 CPU 4 determines that the camera transformations are not
sufficiently accurate. On the other hand, if the number of points, P, is four or more, CPU 4 determines that the camera
transformations are sufficiently accurate and processing proceeds to step S364. At step S364, CPU 4 determines
whether the number of points, P, for the most accurate solution is greater than 80% of the number of triple points in
the list stored at step S218 (Figure 22). If the number of points is greater than 80%, CPU 4 determines that there is no
need to process the calculated camera transformations further to make them more accurate, since they are already
sufficiently accurate. Processing therefore proceeds to step S370, where the solution is converted to full camera
rotation and translation matrices, defining the relative positions of the triple of images (including scale and ρ values).

[0251] If it is determined at step S364 that the number of points is not greater than 80%, at step S366 CPU 4
determines whether the most accurate solution corresponds to the perspective-perspective case. If it does, CPU 4
determines that the solution should not be optimised further, and processing proceeds to step S370, where the solution
is converted to full camera rotation and translation matrices. The solution for the perspective-perspective case is not
optimised because the values of ρ1, ρ2 are already accurate (having been defined in the fundamental
matrix calculated by CPU 4 at step S240). On the other hand, if the most accurate solution does not
correspond to the perspective-perspective case, then at step S368 CPU 4 minimises the following function, f(ρ), using
a conventional optimisation method:



$$f(\rho) = -P + \mathrm{error} \qquad (30)$$

where P is the number of points stored at step S408 in Figure 32 and step S436 in Figure 34, the minus sign
indicating that P is to be maximised, and "error" is the total error stored at step S436 (Figure 34), the positive sign
indicating that the error is to be minimised.

[0257] Figure 37 shows the operations performed by CPU 4 in step S450. Referring to Figure 37, at step S460,
CPU 4 reads the camera solution previously calculated for the first pair of images in the triple. At step S464, CPU 4
creates a list of pairs of points which were matched in the second pair of images by a user, a list of pairs of
points comprising the user-identified points and the points calculated to be matching at steps S54 or S64 in Figure 7
(CPU 4 creating this list in the manner described above with respect to step S218 in Figure 22), and a list of points
which are matched across all three images in the triple of images. (Note that [...] step S54 or S64 [...]
will form part of a triple of points [...]
step S394, if selected). As noted above with respect to step S218 in Figure 22, the number of user-identified points
may be zero if affine initial feature matching has not been performed.

[0258] At step S466, CPU 4 normalises the points in the lists created at step S464, and at step S468, sets up two
measurement matrices: one for the list of user-identified points and one for the list of user-identified and calculated
points. These steps are carried out in the same way as steps S220 and S222 in Figure 22 described above, and
accordingly will not be described again. At step S470, CPU 4 determines the number of iterations to be performed
when carrying out the perspective and affine calculations for the second pair of images in the triple. This is performed
in the same way as step S224 in Figure 22 described above, and accordingly will not be described again.

[0259] Referring again to Figure 36, having set up the necessary parameters at step S450, at step S452, CPU 4
calculates the camera transformation for the second pair of images in the triple and stores the results. This is carried
out in the same way as step S208 or S210 in Figure 21 described above, and accordingly will not be described again.

[0260] At step S454, CPU 4 uses the camera solutions for the first pair of images read at step S460 (Figure 37)
together with the camera transformation calculated at step S452 for the second pair of images in the triple to calculate
camera transformations between all three images in the triple.

[0261] Figure 38 shows the operations performed by CPU 4 when calculating the camera transformations between
the three images in the triple at step S454 in Figure 36. These operations are very similar to those performed in step
S212 (Figure 21), and described above with respect to Figure 28, when calculating the camera transformations between
the first three images in the positional sequence. As noted above, the relationship between the cameras for the first
pair of images in the triple is already known from calculations on the preceding triple. It is therefore necessary to
consider the transformation between only the second pair of images. Accordingly, at step S472, CPU 4 considers the
case where the transformation between the second pair of images is affine. This is done by considering the camera
solution for the first pair of images (read at step S450 in Figure 36) together with the most accurate affine fundamental
matrix calculated for the second pair of images in step S452 (Figure 36), and calculating the scale, s, and ρ2 using the
same operations described above with respect to step S354 in Figure 28.

[0262] At step S474, CPU 4 considers the case where the transformation between the second pair of images is
perspective. CPU 4 uses the calculation for the first pair of cameras read at step S460 (Figure 37) together with the
most accurate rotation matrix and translation vector for the cameras for the second pair of images obtained in step
S452 (Figure 36) to calculate the scale using the same operations as in step S356 (Figure 28). In steps S476 to S488,
CPU 4 carries out processing which is the same as that carried out at steps S358 to S370 in Figure 28, described
above. That is, CPU 4 selects the most accurate solution from the one calculated at step S472 and the one calculated
at step S474, and determines whether this is sufficiently accurate or not, optimising it if necessary at step S486 (which
corresponds to step S368 in Figure 28). It is noted that the solution is not optimised if it is determined at step S484
that the solution corresponds to the perspective case, since it is the values of ρ which are optimised and, in the
perspective transformation for the second pair of images, ρ is already sufficiently accurate since it is defined in the
calculated fundamental matrix; and the value of ρ for the first pair of images will either be defined in a fundamental
matrix if the transformation is perspective or will already have been optimised at step S368 in Figure 28 if the
transformation is affine.

[0263] Referring again to Figure 7, a description will now be given of the way in which CPU 4 performs constrained
feature matching for a triple of images at step S74.

[0264] Figure 39 shows, at a top level, the operations performed by CPU 4 when carrying out constrained feature
matching.

[0265] Referring to Figure 39, at step S500, CPU 4 considers "double" points in the first pair of images in the triple, 
that is points which have been matched between the first pair of images at step S52, S54, S60, S62, S64, S72 or S74 
(steps S54, S64 and S74 being applicable if performed for a previous triple of images) in Figure 7, but which have not 
45 been matched between the second and third images in the triple. For each pair of such "double" points, CPU 4 tries 
to identify the corresponding point in the third image. If it is successful, a triple of points, (that is, points matched across 
all three images) is created. 

[0266] Similarly, at step S502, CPU 4 considers "double" points in the second and third images of a current triple
(that is, points which have been matched across the second pair of images at step S54, S60, S64 or S72 in Figure 7,
but which have not been matched across the first pair of images in the triple) and tries to identify a corresponding point
in the first image to create new triples of points.

[0267] Figure 40 shows the operations performed by CPU 4 at step S500 and at step S502 in Figure 39. Referring
to Figure 40, at step S504, CPU 4 considers the next point in the second (centre) image of the triple which forms a
"double" point with the other image of the pair (the first image when performing step S500 or the third image when
performing step S502) and uses the camera transformation calculated at step S56 or step S66 in Figure 7 to identify
a point in a corresponding location in the remaining image of the triple (the third image when performing step S500 or
the first image when performing step S502).

[0268] At step S506, CPU 4 calculates a similarity measure between the point in the second image and points lying
[...] point in the y direction. Thus, points within a square of five by five pixels (that is, two pixels on either side of the
identified point) are considered in the remaining image of the triple. CPU 4 calculates the similarity measure using an
adaptive least squares correlation technique, for example such as that described in the paper "Adaptive Least Squares
Correlation: A Powerful Image Matching Technique" by A.W. Gruen, Photogrammetry, Remote Sensing and Cartography,
1985, pages 175-187, [...] to identify a best matching point.

[0269] At step S510, CPU 4 determines whether the similarity measure of the best matching point identified at step
S506 is greater than a threshold, that is, whether it is sufficiently high to consider the points to be matching points,
and, if so, forms a triple of points. If the similarity measure is not greater than the threshold, step S512 is omitted, so
that no triple of points is formed for the double of points under consideration.
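By way of illustration only, the windowed search and threshold test of steps S506 to S510 might be sketched as follows. This is a minimal sketch under stated assumptions: it substitutes plain normalised cross-correlation for the adaptive least squares correlation technique named above, represents images as lists of rows of grey values, and uses hypothetical names (`constrained_match`, `ncc`, `patch`) and a hypothetical threshold of 0.9, none of which come from the specification.

```python
import math

def patch(img, x, y, r):
    """Return the (2r+1) x (2r+1) patch centred on (x, y), or None if it leaves the image."""
    if y - r < 0 or x - r < 0 or y + r >= len(img) or x + r >= len(img[0]):
        return None
    return [img[j][i] for j in range(y - r, y + r + 1) for i in range(x - r, x + r + 1)]

def ncc(a, b):
    """Normalised cross-correlation of two equal-length patches (assumed similarity measure)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(p * q for p, q in zip(da, db)) / denom if denom else 0.0

def constrained_match(img2, pt2, img_other, predicted, threshold=0.9, search=2, r=1):
    """Search the 5 x 5 window around `predicted` for the best match to pt2.

    Points are (x, y) tuples. Returns the matched point, or None when the best
    score does not exceed the threshold (no triple of points is formed).
    """
    ref = patch(img2, pt2[0], pt2[1], r)
    if ref is None:
        return None
    best_score, best_pt = -2.0, None
    px, py = predicted
    for y in range(py - search, py + search + 1):
        for x in range(px - search, px + search + 1):
            cand = patch(img_other, x, y, r)
            if cand is None:
                continue
            score = ncc(ref, cand)
            if score > best_score:
                best_score, best_pt = score, (x, y)
    return best_pt if best_score > threshold else None
```

With `search=2` the window covers two pixels on either side of the predicted location, matching the five by five square described above.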

[...] of the pair of images being considered have been processed in the manner described above.

[...] new matches between points in the second pair of images of a triple of images (step S500 in Figure 39) and new
matches between points in the first pair of images of the triple (step S502 in Figure 39). These new matches are used
by CPU 4 to generate the three-dimensional data at step S10 in Figure 3, as will be described below. In addition,
however, the new matches generated between points in the second pair of images in a triple are taken into account
during initial feature matching for the next triple of images. This is because, as explained previously, [...] carried out
at step S74 to identify new matches for the second pair of images in [...] images becomes the first pair of images in
the next triple of images considered, and both the automatic initial feature matching performed at step S54 and the
affine initial feature matching performed at step S64 attempt to match [...] of images in the triple which have previously
been matched across the first pair of [...] between points in the first pair of images calculated during constrained
feature matching [...] taken into consideration when performing initial feature matching for the next triple of images,
[...] taken into account when CPU 4 generates the three-dimensional data at step S10 in Figure 3, as will be described
below. When constrained feature matching is carried out at step S74 in Figure 7 for the [...] of images to be considered,
[...] are not taken into consideration during initial feature matching [...] second pair in the triple [...].

[...] calculating the camera transformations [...] and performing constrained feature matching [...] above, CPU 4
uses the results to generate 3D data at step S10. The aim of [...] is to generate a single set of points in a
three-dimensional space correctly positioned [...] object 24.
[0274] Figure 41 shows the operations performed by CPU 4 when generating the 3D data at step S10 in Figure 3.
Referring to Figure 41, at step S520, CPU 4 considers each pair of images in the sequence (in the example of
Figures 2 and 5, the pairs comprising [...]), and projects points within the pair which form either
a user-identified "double" of points (that is, a pair of points [...] at step
S60 or S72 in Figure 7 but not matched with a point in the [...])
[...] pair of images, or part of a triple of points with a [...] or following [...]
(that is, points which are matched, either by a user
or by CPU 4, between the images in the pair and [...]
the positional sequence) to calculate a single [...] of points. At step S520, CPU 4
considers only pairs of matched points which i) were considered to be sufficiently accurate with the calculated camera
transformation when this transformation [...], [...] were identified as new matching
points when constrained feature matching was performed [...], [...] pair of points extended
from a pair to a triple during constrained feature matching [...] points matched during initial
feature matching which were not considered to be sufficiently [...]

[...] in three-dimensional space through the optical centre of the camera for that point. This produces rays similar to those
shown in Figure 35, with the exception that the rays are projected from adjacent images in Figure 35 since the images
are considered in pairs.

[0276] At step S534, CPU 4 calculates the mid-point of the line segment which connects, and is perpendicular to,
both the lines projected in step S532 (this mid-point corresponding to the point 68 shown in Figure 35, and representing
a physical point on the surface of object 24). At step S536, CPU 4 determines whether a corresponding point has been
matched in the next image of the sequence, that is, whether the points from which rays were projected in step S532
form part of a triple of points with the subsequent image. If it is determined that a corresponding point has been
matched in the next image, at step S538, CPU 4 projects a line from the matched point in the next image in the same
way that it did from the points in step S532. At step S540, CPU 4 calculates the mid-point of the line segment which
connects, and is perpendicular to, the new line projected at step S538 and the line projected from the point in the
previous image at step S532, in the same way that the mid-point is calculated in step S534.
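The mid-point calculation of steps S534 and S540 can be illustrated with the standard closed-form solution for the common perpendicular between two lines. The following is a minimal sketch, not the implementation of the embodiment; each ray is assumed to be given by a camera optical centre `c` and a direction `d`, and the function name `ray_midpoint` is hypothetical.

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def ray_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Mid-point of the segment perpendicular to both lines c1 + s*d1 and c2 + t*d2.

    Returns None when the lines are (nearly) parallel, since the common
    perpendicular is then not unique.
    """
    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    s = (b * e - c * d) / denom  # parameter of the closest point on line 1
    t = (a * e - b * d) / denom  # parameter of the closest point on line 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))
```

For a point matched across several successive images, calling this function once per adjacent pair of rays yields one 3D point per pair.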

[0277] At step S542, CPU 4 determines whether a corresponding point has been matched in the next image of the 
sequence. Steps S538 to S542 are repeated until the next image in the sequence does not contain a corresponding 
matched point or until all the images in the sequence have been processed. 

[0278] By way of example, referring to a sequence of images containing five images, such as the example shown 
in Figure 2 and Figure 5, steps S532 and S534 will project a ray from a point in the first image and a matched point in 
the second image and calculate a single three-dimensional point (the mid-point in step S534) which represents the 
projection of the point in the first image and the point in the second image. Thus, a single point in three-dimensional 
space representing a physical point on the surface of object 24 is obtained from a pair of points between adjacent 
images in the sequence. If the third image in the sequence contains a point which is matched to those in the first and 
second images (determined at step S536), steps S538 and S540 project a line from the point in the third image and
calculate the mid-point of the line segment which connects, and is perpendicular to, the line from the point in the second 
image and the line from the point in the third image, this mid-point representing the 3D point resulting from the projection 
of the points in the second image and third image. Similarly, if the fourth image in the sequence has a point matched 
to that in the third image (determined at step S542), steps S538 and S540 are repeated to project a line from the point 
in the fourth image and calculate the mid-point of a line segment which connects, and is perpendicular to, the line from 
the fourth image and the line from the third image. A further 3D point representing the projection of points from the 
fourth and fifth images in the sequence will be obtained by steps S538 and S540 if it is determined at step S542 that a
corresponding point has been matched in the fifth image of the sequence. Thus, if the point is matched in all five images 
of the sequence, four 3D points are produced (representing the same physical point on the surface of object 24), 
although it is unlikely that the 3D position of these will be exactly coincident due to errors in the calculated camera 
transformations and the matches themselves. Instead, the points form a cluster 80 in 3D space, as shown in Figure 43. 
[0279] Referring again to Figure 42, at step S544, CPU 4 determines whether there is another pair of points not 
previously considered in the current pair of images which form a user-identified "double" of points across the pair of 
images or form part of a triple of points with a subsequent image. Steps S532 to S544 are repeated until all such points 
have been considered. Each such pair of points produces either a single point 82 in 3D space (Figure 43) if it is
determined at step S536 that a corresponding point has not been matched in the next image, or a cluster of points if
the corresponding point has been matched in at least the next image. If the point is matched across three successive
images in the sequence, the cluster contains two points; if it is matched across four successive images in the sequence,
it contains three points; and, as described above, if it is matched across five images in the sequence, the cluster
comprises four points as shown in cluster 80 of Figure 43.

[0280] At step S546, CPU 4 considers whether there is another pair of images in the sequence. Steps S532 to S546
are repeated until all pairs of images in the sequence have been processed as described above. The result is a plurality
of clusters of points in three-dimensional space as shown in Figure 43, with the points within each cluster corresponding
to what should be a single 3D point (this representing a point on the surface of object 24).

[0281] Referring again to Figure 41, at step S522, CPU 4 uses the 3D points calculated at step S520 to calculate 
the error in the transformation previously calculated for each camera, and to identify and discard inaccurate ones of 
the 3D points. 

[0282] Figure 44 shows the operations performed by CPU 4 at step S522 in Figure 41. Referring to Figure 44, at
step S550, CPU 4 considers all of the points in three-dimensional space calculated at step S520 (Figure 41) and
calculates the standard deviation of the x co-ordinates, Δx, the standard deviation of the y co-ordinates, Δy, and the
standard deviation of the z co-ordinates, Δz. At step S552, CPU 4 calculates the "size" of the object made up of the
points in the three-dimensional space using the formula:

$$\text{Size} = \left(\Delta x^{2} + \Delta y^{2} + \Delta z^{2}\right)^{1/2} \qquad (31)$$
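As a short illustration of Equation 31, the "size" can be computed from the per-axis standard deviations of the 3D points. This sketch assumes the population standard deviation (dividing by the number of points); the text does not say whether the population or sample form is intended, and the function name `object_size` is hypothetical.

```python
import math
import statistics

def object_size(points):
    """Return (dx^2 + dy^2 + dz^2)^(1/2), where dx, dy, dz are per-axis standard deviations.

    `points` is a sequence of (x, y, z) tuples. The population standard
    deviation is an assumption made for this sketch.
    """
    dx = statistics.pstdev([p[0] for p in points])
    dy = statistics.pstdev([p[1] for p in points])
    dz = statistics.pstdev([p[2] for p in points])
    return math.sqrt(dx * dx + dy * dy + dz * dz)
```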
$$P'_n = R\,P_n + t \qquad (32)$$

The sum, over all common points, of the modulus of the dot product $(P'_n - P_c)\cdot(P'_n - P_c)$ is minimised.

[0288] At step S566, CPU 4 applies the error rotation matrix and the error translation vector calculated at step S564
to each point previously calculated for the subsequent pair of camera positions (#2 in Figure 45b). For each previously
calculated point, this gives a corrected point (P_n', given by Equation 32 above) which is now positioned closer to the
point for the current pair of camera positions, as shown in Figure 46, in which the points for the current pair of camera
positions are represented by dots as before, and the corrected points for the subsequent pair of camera positions are
represented by crosses.

[0289] At step S568, CPU 4 calculates the difference between the co-ordinates of each corrected 3D point calculated
at step S566 and its corresponding point, and calculates the co-variance matrix of the resulting differences, this being
performed using conventional mathematical techniques. The resulting co-variance matrix defines a Gaussian
distribution in three dimensions, which represents a three-dimensional error ellipsoid for the error transform calculated at
step S564. Thus, in steps S564 to S568, CPU 4 has calculated an error transform for the subsequent pair of camera
positions and the error (the error ellipsoid) associated with the error transform.
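Steps S566 and S568 can be sketched as applying a rigid transform to each point and then taking the covariance of the residuals against the corresponding points. The sketch below assumes that the error rotation matrix `R` (3 x 3, as nested lists) and error translation vector `t` have already been estimated at step S564, and that the covariance is normalised by the number of points; the function names are hypothetical.

```python
def apply_transform(R, t, p):
    """Return R * p + t for a 3 x 3 matrix R and 3-vectors t and p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def covariance(diffs):
    """3 x 3 covariance matrix of a list of 3-vectors (normalised by n, an assumption)."""
    n = len(diffs)
    mean = [sum(d[i] for d in diffs) / n for i in range(3)]
    return [[sum((d[i] - mean[i]) * (d[j] - mean[j]) for d in diffs) / n
             for j in range(3)] for i in range(3)]

def correct_and_measure(R, t, next_points, current_points):
    """Correct the points of the subsequent pair and return (corrected, covariance).

    next_points[k] and current_points[k] are assumed to describe the same
    physical point as calculated for the subsequent and current camera pairs.
    """
    corrected = [apply_transform(R, t, p) for p in next_points]
    diffs = [tuple(a - b for a, b in zip(c, q))
             for c, q in zip(corrected, current_points)]
    return corrected, covariance(diffs)
```

The eigenvectors and eigenvalues of the returned matrix give the axes of the error ellipsoid; that decomposition is omitted here to keep the sketch dependency-free.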

[0290] At step S570, CPU 4 determines whether there is another pair of camera positions which has not yet been 
considered. Steps S554 to S570 are repeated until the data for all pairs of camera positions has been processed in 
the manner described above. 

[0291] It will be appreciated that an error transform is not calculated at step S564 for the first pair of camera positions
in the sequence. This pair of camera positions is assumed to have zero error. It will also be appreciated that the error
transform for a given pair of camera positions is calculated relative to the previous pair of camera positions. Thus, the
error transform for the second pair of camera positions (that is, the positions producing the second and third images in
a sequence) includes no cumulative error since the error for the first pair of camera positions is assumed to be zero.
On the other hand, the error transform for each subsequent pair of camera positions will include cumulative error. For
example, the error transform for the third pair of camera positions (that is, the positions producing the third and fourth
images in the sequence) is calculated relative to the error transform for the second pair of camera positions.
Accordingly, the calculated error transform and co-variance matrix for the third pair of camera positions need to be
adjusted by the error transform and co-variance matrix for the second pair of camera positions to give a total,
cumulative error for the third pair of camera positions. Similarly, the calculated error transform and co-variance matrix
for the fourth pair of camera positions (producing the fourth and fifth images in the sequence) need to be adjusted by
the error transform and co-variance matrix for both the second pair of camera positions and the third pair of camera
positions (that is, the cumulative error for the third pair of camera positions) to give a total, cumulative error for the
fourth pair of camera positions.
[0292] This is carried out by CPU 4 at step S572 as follows: 

[Equations (33), (34) and (35): not recoverable from the source text]

where R_i' is the rotation matrix for the ith cumulative error transform, R_i is the rotation matrix for the ith individual
error transform, t_i' is the translation vector for the ith cumulative error transform, t_i is the translation vector for the
ith individual error transform, C_i' is the covariance matrix for the ith cumulative error transform, and C_n is the
covariance matrix for the nth individual error transform.

[0293] Referring again to Figure 41, after calculating the error for each pair of camera positions at step S522, at step
S524, CPU 4 adjusts the co-ordinates of each remaining point in the three-dimensional space (that is, the points
calculated at step S520 less those discarded at step S560 in Figure 44) by the appropriate camera position error. This is
done by applying the cumulative error transform (calculated previously at step S572 in Figure 44) to the point position
and adding the appropriate error ellipsoid (also previously calculated at step S572 in Figure 44) to the point. For
example, points produced at step S520 from the first pair of images in the sequence are not adjusted at step S524 since,
as described above, it is assumed that the camera position error is zero for this pair of images. The points produced
[...] used to create the [...] in the list [...]

[0300] Referring again to Figure [...], after performing steps [...], CPU 4 has produced a plurality of points in
three-dimensional space, each of which relates to a point [...].

[...] Referring again to Figure 3, at step S12, CPU 4 processes the points to generate surfaces, representing the
surface of object 24.

[...] the operations performed by CPU 4 when generating the surfaces at step S12 in Figure 3. [...] CPU 4 performs
a Delaunay triangulation of the points in the three-dimensional space in a conventional manner, for example as
described in "Three-Dimensional Computer Vision" by Faugeras, Chapter 10, MIT Press, ISBN 0-262-06158-9.

[...] space which originated from a point matched in the image data for that [...] projecting the ray between the
camera and the 3D point. CPU 4 stops the ray at the [...] Chapter 7 of "Graphics Gems" by A. Glassner, Academic
Press Professional, 1990, ISBN 0-12-[...] between the point and the camera, otherwise the camera would not be able
to see the [...]. [...] S600 are repeated until all of the cameras have been considered to remove surfaces as
described above.
[0304] In the processing described above, at step S594, CPU 4 projects the ray from a camera to the edge of the
error ellipsoid for a point (rather than to the point itself) and considers whether the ray intersects any surface. This
provides the advantage that the positional error for a point is taken into account. For example, if the ray were projected
all the way to a point, a surface lying between the point and the edge of its error ellipsoid nearest to the camera would
be intersected by the ray and hence removed. However, this may produce an inaccurate result since the 3D point could
actually lie anywhere in its error ellipsoid and could therefore be in front of the surface. The processing in the present
embodiment takes account of this.
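The idea of stopping the ray at the edge of the error ellipsoid nearest to the camera can be illustrated with a simplified case. The sketch below assumes an axis-aligned ellipsoid described by three semi-axis lengths, whereas the error ellipsoid of the embodiment is derived from a co-variance matrix and need not be axis-aligned; the function name `ray_end_at_ellipsoid` is hypothetical.

```python
import math

def ray_end_at_ellipsoid(camera, point, semi_axes):
    """Return where a ray from `camera` towards `point` meets the ellipsoid around `point`.

    The ellipsoid is centred on `point` with axis-aligned semi-axes
    (a simplifying assumption). Because the ray passes through the centre,
    the near intersection lies at point + d / |S^-1 d|, where d = camera - point
    and S is the diagonal matrix of semi-axes.
    """
    d = [c - p for c, p in zip(camera, point)]
    scaled = math.sqrt(sum((di / ai) ** 2 for di, ai in zip(d, semi_axes)))
    if scaled <= 1.0:
        # The camera lies inside or on the ellipsoid: the ray has zero length.
        return tuple(camera)
    return tuple(p + di / scaled for p, di in zip(point, d))
```

A surface would then be tested for intersection only against the segment from the camera to the returned end point, so that a surface lying inside the ellipsoid is not removed.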

[0305] At step S602, CPU 4 considers the remaining triangular surfaces, and removes any which do not have a
surface touching free space (this corresponding to a surface which is enclosed within the interior of the object). This
is performed using a conventional technique, for example as described in "Three-Dimensional Computer Vision" by
Faugeras at Chapter 10, MIT Press, ISBN 0-262-06158-9.

[0306] After performing steps S590 to S602, CPU 4 has produced a plurality of surfaces in a three-dimensional space
representing the object 24. At steps S604 to S610, CPU 4 determines the texture to be displayed on each triangular
surface.

[0307] At step S604, CPU 4 calculates the normal to the next remaining triangle (this being the first remaining triangle
the first time step S604 is performed). At step S606, CPU 4 calculates the dot product between the normal calculated
at step S604 and the optical axis of each camera to identify the camera which viewed the triangle closest to normal
(this being the camera having the smallest angle between its optical axis and the normal to the surface). At step S608,
CPU 4 reads the data for the camera identified in step S606 (previously stored at step S18 in Figure 4) and reads the
image data lying between the vertices of the triangle to determine the texture for the triangle. At step S610, CPU 4
determines whether there is another remaining triangle for which the texture is to be determined. Steps S604 to S610
are repeated until the texture has been determined for all triangles.
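The camera selection of steps S604 and S606 might be sketched as follows. This is an illustration under assumptions: the normal is treated as an undirected line, so the absolute value of the cosine is compared (a camera looking straight at a surface has an optical axis anti-parallel to the outward normal), and no check is made that the chosen camera actually sees the triangle. The function names are hypothetical.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(u):
    n = math.sqrt(dot(u, u))
    return tuple(a / n for a in u)

def triangle_normal(v0, v1, v2):
    """Unit normal of the triangle (v0, v1, v2)."""
    return unit(cross(sub(v1, v0), sub(v2, v0)))

def closest_to_normal_camera(normal, optical_axes):
    """Index of the camera whose optical axis makes the smallest angle with the normal line."""
    return max(range(len(optical_axes)),
               key=lambda i: abs(dot(normal, unit(optical_axes[i]))))
```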

[0308] Referring again to Figure 3, in this embodiment, after generating the surfaces representing the object at step
S12, CPU 4 displays the surfaces at step S14. This is performed in a conventional manner, for example as described
in "Computer Graphics: Principles and Practice" by Foley, van Dam, Feiner & Hughes, Second Edition, Addison-Wesley
Publishing Company Inc., ISBN 0-201-12110-7. This process is summarised below.

[0309] Figure 50 shows the operations performed by CPU 4 in displaying the surface data at step S14. Referring to
Figure 50, at step S620, CPU 4 calculates the lighting parameters for the object, that is, the data defining how the object
is to be lit. This data may be input by a user using the input device 14, or, alternatively, default lighting parameters may
be used. At step S622, the direction from which the object is to be viewed is defined by the user using input device 14.
[0310] At step S624, the vertices defining the planar triangular surfaces of the object are transformed from the object
space in which they are defined into a modelling space in which the light sources are defined. At step S626, the
triangular surfaces are lit by processing the data relating to the position of the light sources and the texture data for
each triangular surface (previously determined at step S608). Thereafter, at step S628, the modelling space is
transformed into a viewing space in dependence upon the viewing direction selected at step S622. This transformation
identifies a particular field of view, which will usually cover less than the whole modelling space. Accordingly, at step
S630, CPU 4 performs a clipping process to remove surfaces, or parts thereof, which fall outside the field of view.
[0311] Up to this stage, the object data processed by the CPU 4 defines three-dimensional co-ordinate locations. At
step S632, the vertices of the triangular surfaces are projected to define a two-dimensional image.

[0312] After projecting the image into two dimensions, it is necessary to identify the triangular surfaces which are
"front-facing", that is, facing the viewer, and those which are "back-facing", that is, cannot be seen by the viewer.
Therefore, at step S634, back-facing surfaces are identified and culled. Thus, after step S634, vertices are defined in two
dimensions identifying the triangular surfaces of visible polygons.
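One conventional way of identifying back-facing triangles after projection is to test the sign of each triangle's signed area in the two-dimensional image. The sketch below assumes that front-facing triangles have counter-clockwise vertex order with the y axis pointing up; the specification does not state which convention is used, and the function names are hypothetical.

```python
def signed_area(a, b, c):
    """Signed area of the 2D triangle (a, b, c); positive for counter-clockwise order."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def cull_back_faces(triangles):
    """Keep only triangles whose projected vertices are counter-clockwise (assumed front-facing)."""
    return [t for t in triangles if signed_area(*t) > 0]
```

Degenerate triangles with zero projected area (seen edge-on) are also discarded by this test.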

[0313] At step S636, the two-dimensional data defining the surfaces is scan-converted by CPU 4 to produce pixel
values, taking into account the data defining the texture of each surface previously determined at step S608 in Figure 49.
[0314] At step S638, the pixel values generated at step S636 are written to the frame buffer on a surface-by-surface
basis, thereby generating data for a complete two-dimensional image.

[0315] At step S640, CPU 4 generates a signal defining the pixel values. The signal is used to generate an image
of the object on display unit 18 and/or is recorded, for example on a video tape in video tape recorder 20. The signal
may also be transmitted to a remote receiver for display or recording.
[0316] Various modifications are possible to the embodiment described so far. 

[0317] In the embodiment above, as described with reference to Figure 2, camera 12 is moved to different positions
about object 24 in order to record the images of the object. Instead, camera 12 may be maintained in a fixed position
and object 24 moved relative thereto. Of course, both the camera 12 and the object 24 may be moved
to record the images.

[0318] Camera 12 may be a video camera recording a continuous sequence of images of the object 24. Image data
for processing by CPU 4 may be obtained by selecting frames of image data from the video sequence.
[0319] In the embodiment above, when arranging the positional sequence of the images at steps [...],
instead, [...] their positions in the [...].

[...] techniques [...] be used [...] the ones described above which are performed [...] the initial feature matching
technique performed at steps S52 and S54, which [...] techniques described in "Computer and Robot Vision Volume 1"
by Haralick [...] corner points, minimum points, maximum points and saddle points.

[...] In the embodiment above, when performing affine initial feature matching at steps S62 and S64 in Figure 7 [...]

[0324] In the embodiment above, when performing affine initial feature matching at step [...], CPU 4 [...]

[...] When calculating camera transformations at steps S56 and S66 in the embodiment above, CPU 4 carries
out the perspective calculation twice (Figure 25): once using user-identified points alone (steps S246 to [...]) and
once using both user-identified and CPU-calculated points (steps S266 to [...]). [...]

- once using user-identified points alone and once using CPU-calculated points alone; or

- once using CPU-calculated points alone, and once using both user-identified and CPU-calculated points.

[...] camera transformation at step S240, CPU 4 tests the physical fundamental matrix (steps S253, S255, S273 and
[...] in Figure [...]). Instead, [...] realisable matrix (such as the physical essential matrix) [...].

[...] When performing constrained feature matching in the embodiment above (step S74 in Figure 7), in steps S500
and [...] Gruen, Photogrammetry, Remote Sensing and Cartography, 1985, pages 175-187, [...]
with a point in the third image, thereby identifying a triple of points.

[0329] In the embodiments described above, when performing affine initial feature matching on a pair of images at
step S62 or S64 in Figure 7, CPU 4 considers points in the first image of the pair which have been matched with points
in the preceding image in the sequence but which have not yet been matched with a point in the second image of the
pair, and performs processing to try to match such points with points in the second image of the pair (steps S166 to
S176 in Figure 18). Thus, CPU 4 performs processing to "propagate" matched points through the sequence of images
from a current image to a succeeding image in the sequence. It is also possible to perform such processing to
"propagate" points in the opposite direction, that is, from a current image to a preceding image in the sequence. For example,
the images in the sequence could be considered in reverse order, that is, starting with the final image in the sequence (the
image taken at position L5 in the example of Figure 2), and the data processed in a similar manner to that already
described. Processing can also be performed to "propagate" points in both directions, this being likely to provide more
matches between points than when processing is performed to "propagate" points in a single direction. This, in turn,
may enable more accurate camera transformations to be calculated at step S66 in Figure 7.

[0330] In the embodiment above, when CPU 4 performs constrained feature matching at step S74 in Figure 7, new
matches between points in the second and third images of a triple of images may be identified at step S500 in Figure
39. As explained previously, these points are considered in subsequent processing since the pair of images across
which the new points are matched becomes the first pair of images in the next triple of images considered. Thus, when
automatic initial feature matching or affine initial feature matching for the second pair of images in the next triple is
performed at step S54 or step S64, the new matched points from the constrained feature matching may be used to
identify matching points in the third image of the triple, as described above. On the other hand, in the embodiment
above, the new matches generated at step S502 in Figure 39 between points in the first and second images of a triple
when CPU 4 performs constrained feature matching are not considered in any subsequent initial feature matching
operations. This is because the new matches are across the first pair of images in the triple, and this pair is not
considered further in subsequent initial feature matching processing. The new matches are, however, taken into account
when CPU 4 generates the 3D data at step S10 (Figure 3) since the newly matched points form part of a "triple" of points.
As a modification, it is possible to perform additional processing to recalculate the camera transformations taking into
account any new matches identified during constrained feature matching. This would produce two solutions for the
camera transformations for each triple of images: the first being produced in the manner described above with respect
to Figure 7, and the second being produced by the additional processing to take into account the new matches. The
more accurate of the two solutions may then be selected.

[0331] In the embodiment described, in steps S52, S54, S60, S62, S64, S72 and S74 points (corner points, minimum 
points, maximum points, saddle points etc.) are matched in the images. However, it is possible to identify and match 
other "features", for example lines etc. 

[0332] At step S528 in the embodiment above, CPU 4 merges points if they lie within one standard deviation of each
other.

[0333] However, it is possible to delete one of the points instead of combining them. 

[0334] In the embodiment described, having generated the surfaces at step S12 in Figure 3, CPU 4 performs
processing to display the surface data at step S14. Instead of, or in addition to, displaying the surface data at step
S14, CPU 4 may: control manufacturing equipment to manufacture a model of the object 24, for example by controlling
cutting apparatus to cut material to the appropriate dimensions; perform processing to recognise the object, for example
by comparing it to data stored in a database; carry out processing to measure the object, for example by taking absolute
measurements to record the size of the object, or by comparing the model with models of the object previously generated
to determine changes therebetween; carry out processing so as to control a robot to navigate around the object; transmit
the object data representing the model to a remote processing device for such processing (for example, CPU 4 may
transmit the object data in VRML format over the Internet, enabling it to be processed by a WWW browser). Of course,
the object data may be utilised in other ways.

[0335] The techniques described above can be used in terrain mapping and surveying, with the three-dimensional 
data being input to a geographic information system (GIS) or other topographic database for example. 

50 

Claims 

1. In an image processing apparatus having a processor for processing input signals defining images of an object
taken from a plurality of undefined camera positions, a method of processing the input signals to produce signals
defining matching features in the images, the method comprising the steps of:

(a) identifying matching features in the images using a first technique; 

(b) calculating the camera positions using identified matching features;

(c) determining the accuracy of the calculated camera positions; and

(d) if the accuracy of the calculated camera positions is below a threshold:

[...] displaying the images to a user and storing signals defining matching features identified in the images
by the user; and

[...] identifying further matching features using a second technique and the matching features
identified by the user.

2. [...]

3. A method according to claim [...], wherein the second technique comprises dividing each image into regions
[...] by the user, calculating the transformation of corresponding regions between
images, and identifying matching features within corresponding regions using the calculated transformations.

4. A method according to any preceding claim, wherein step (b) includes calculating the relative position of the camera
optical centre for the images.
5. A method according to any preceding claim, further comprising the steps of:

(e) calculating the camera positions using at least some of the matching features identified by the user or
calculated using the second technique;

(f) determining the accuracy of the camera positions calculated in step (e); and

(g) if the accuracy determined in step (f) is below a threshold, repeating steps (d) to (f) until the accuracy is
equal to, or above, the threshold.

6. A method according to claim 5, wherein step (e) comprises: 

calculating the camera positions using features from a first set of matching features and 
calculating the camera positions using features from a second set of matching features. 

7. A method according to claim 6, wherein: 

the first set of matching features comprises features identified by the user; and

the second set of matching features comprises either (i) matching features identified using the first technique
or the second technique or (ii) matching features identified using the first technique or the second technique
together with matching features identified by the user.

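8. A method according to any of claims 5 to 7, wherein step (e) includes calculating the relative position of the
camera optical centre for the images.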

9. A method according to any preceding claim, wherein the number of features identified by the user is less than the 
number of further matching features identified using the second technique and the features identified by the user. 

10. A method according to any preceding claim, wherein the input signals define images of the object taken from at
least three undefined camera positions.

11. A method according to any preceding claim, wherein the matching features comprise matching points. 

12. A method according to any preceding claim, further comprising the step of processing signals defining at least
some of the matching features identified by the user or by using the second technique to generate object data
defining a model of the object in a three-dimensional space.

13. A method according to claim 12, further comprising the step of processing the object data to generate image data.

14. A method according to claim 13, further comprising the step of displaying an image of the object.

15. A method according to claim 13 or claim 14, further comprising the step of recording the image data.

16. A method according to any of claims 12 to 15, further comprising the step of transmitting a signal conveying the
object data.

17. A method according to any of claims 12 to 16, further comprising the step of recording the object data. 

18. A method of operating an image processing apparatus to process image data comprising images of an object 
taken from a plurality of imaging positions of undefined relationship, so as to identify corresponding object features 
in the images, the method comprising: 

identifying features using a first technique;

determining the relationship between the imaging positions using the identified features; 
testing the accuracy of the determined relationship and, if it is not sufficiently high: 

(i) identifying features on the basis of user-input signals; and 
(ii) identifying further features using a second technique and using the features identified in step (i).

19. An image processing apparatus for processing input signals defining images of an object taken from a plurality of 
undefined camera positions to produce signals defining matching features in the images, comprising: 

(a) means for identifying matching features in the images using a first technique;

(b) means for calculating the camera positions using identified matching features; 

(c) means for determining the accuracy of the calculated camera positions; and 

(d) means for, if the accuracy of the calculated camera positions is below a threshold, identifying further match- 
ing features in the images using a second technique and matching features identified by a user. 
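
The claims do not say how the accuracy of the calculated camera positions is measured. One possible measure, shown here purely as an assumption, is the fraction of matching points that lie within a tolerance of the epipolar line predicted by the calculated camera geometry. The 3x3 matrix F (a fundamental matrix mapping a point in one image to a line in the other), the tolerance tol and both function names are hypothetical and do not come from the claims.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]


def epipolar_distance(F: Sequence[Sequence[float]], p1: Point, p2: Point) -> float:
    """Distance in pixels from p2 to the epipolar line F * (p1, 1).

    Assumption: F maps a homogeneous point of the first image to a line
    (a, b, c), with a*x + b*y + c = 0, in the second image.
    """
    x, y = p1
    a, b, c = (F[r][0] * x + F[r][1] * y + F[r][2] for r in range(3))
    norm = math.hypot(a, b)
    if norm == 0.0:
        return math.inf  # no usable line for this point
    return abs(a * p2[0] + b * p2[1] + c) / norm


def consistent_fraction(F: Sequence[Sequence[float]],
                        matches: Sequence[Match], tol: float) -> float:
    """Fraction of matches consistent with the camera geometry to within tol."""
    if not matches:
        return 0.0
    good = sum(1 for p1, p2 in matches if epipolar_distance(F, p1, p2) <= tol)
    return good / len(matches)
```

A value returned by consistent_fraction could then be compared with the threshold referred to in means (d).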

20. Apparatus according to claim 19, wherein the first technique comprises processing the input signals to identify
matching corners in the images.

21. Apparatus according to claim 19 or claim 20, wherein the second technique comprises dividing each image into
regions in accordance with features identified by the user, calculating the transformation of corresponding regions
between images, and identifying matching features within corresponding regions using the calculated
transformations.

22. Apparatus according to any of claims 19 to 21, wherein means (b) includes means for calculating the relative
position of the camera optical centre for the images.

23. Apparatus according to any of claims 19 to 22, further comprising: 

(e) means for calculating the camera positions using at least some of the matching features identified by the
user or calculated using the second technique; and

(f) means for determining the accuracy of the camera positions calculated by means (e); 

the apparatus being controlled such that, if the accuracy determined by means (f) is below a threshold, the 
operations performed by means (d) to (f) are repeated until the accuracy is equal to, or above, the threshold. 

24. Apparatus according to claim 23, wherein means (e) comprises means for: 

calculating the camera positions using features from a first set of matching features; and 
calculating the camera positions using features from a second set of matching features.

25. Apparatus according to claim 24, wherein:

the first set of matching features comprises features identified by the user; and

the second set of matching features comprises either (i) matching features identified using the first technique
or the second technique or (ii) matching features identified using the first technique or the second technique
together with matching features identified by the user.

26. Apparatus according to any of claims 23 to 25, wherein means (e) includes means for calculating the relative
position of the camera optical centre for the images.

27. Apparatus according to any of claims 19 to 26, wherein the number of features identified by the user is less than
the number of further matching features identified using the second technique and the features identified by the
user.

28. Apparatus according to any of claims 19 to 27, wherein the input signals define images of the object taken from 
at least three undefined camera positions. 

29. Apparatus according to any of claims 19 to 28, wherein the matching features comprise matching points.

30. Apparatus according to any of claims 19 to 29, further comprising means for processing signals defining at least 
some of the matching features identified by the user or by using the second technique to generate object data 
defining a model of the object in a three-dimensional space. 





31. Apparatus according to claim 30, further comprising means for processing the object data to generate image data.

32. Apparatus according to claim 31, further comprising means for displaying an image of the object.

33. A storage device storing instructions for causing a programmable processing apparatus to perform a method ac- 
cording to any of claims 1 to 18. 

34. A signal for causing a programmable processing apparatus to perform a method according to any of claims 1 to 18.

35. In an image processing apparatus having a processor for processing input signals defining images of an object
taken from a plurality of undefined camera positions, a method of processing the input signals to produce signals 
defining matching features in the images, the method comprising the steps of: 

(a) displaying the images to a user, and storing signals defining matching features identified in the images by
the user;

(b) identifying further matching features in the images using the matching features identified by the user;

(c) calculating the camera positions using at least some of the matching features identified in step (a) or step (b);

(d) determining the accuracy of the calculated camera positions; and 

(e) if the accuracy of the calculated camera positions is below a threshold, repeating steps (a) to (d) until the 
accuracy is equal to, or above, the threshold.

36. A method according to claim 35, wherein step (b) comprises dividing each image into regions in accordance with
features identified by the user, calculating the transformation of corresponding regions between images, and
identifying matching features within corresponding regions using the calculated transformations.

37. A method according to claim 35 or claim 36, wherein the number of features identified by the user in step (a) is
less than the number of further matching features identified in step (b) using the features identified by the user. 

38. A method according to any of claims 35 to 37, wherein the input signals define images of the object taken from at
least three undefined camera positions.

39. A method according to any of claims 35 to 38, wherein step (c) includes calculating the relative position of the
camera optical centre for the images.

40. A method according to any of claims 35 to 39, wherein the matching features comprise matching points.

41. A method according to any of claims 35 to 40, further comprising the step of processing signals defining at least
some of the matching features identified in step (a) or in step (b) to generate object data defining a model of the
object in a three-dimensional space.

42. A method according to claim 41, further comprising the step of processing the object data to generate image data.

43. A method according to claim 42, further comprising the step of displaying an image of the object.

44. A method according to claim 42 or claim 43, further comprising the step of recording the image data.

45. A method according to any of claims 41 to 44, further comprising the step of transmitting a signal conveying the 
object data. 

46. A method according to any of claims 41 to 45, further comprising the step of recording the object data. 

47. A method of operating an image processing apparatus to process image data comprising images of an object
taken from a plurality of imaging positions of undefined relationship, so as to identify corresponding object features
in the images, the method comprising:

(a) processing user-input signals defining matching features in the images to identify further matching features; 

(b) determining the accuracy of the identified further features; and 

(c) if the accuracy is not sufficiently high, repeating steps (a) and (b). 

48. An image processing apparatus for processing input signals defining images of an object taken from a plurality of 
undefined camera positions to produce signals defining matching features in the images, comprising: 

(a) means for storing signals defining matching features identified in the images by a user;

(b) means for identifying further matching features in the images using the matching features identified by the
user;

(c) means for calculating the camera positions using at least some of the matching features identified by the 
user or means (b); and 

(d) means for determining the accuracy of the calculated camera positions; 

the apparatus having control means operable such that, if the accuracy determined by means (d) is below 
a threshold, the operation of prompting the user to identify further matching features and the operations performed 
by means (b) to (d) are repeated until the accuracy is equal to, or above, the threshold. 

49. Apparatus according to claim 48, wherein means (b) comprises means for dividing each image into regions in
accordance with features identified by the user, means for calculating the transformation of corresponding regions
between images, and means for identifying matching features within corresponding regions using the calculated
transformations.

50. Apparatus according to claim 48 or claim 49, wherein the number of features identified by the user is less than the
number of further matching features identified by means (b) using the features identified by the user. 

51. Apparatus according to any of claims 48 to 50, wherein the input signals define images of the object taken from
at least three undefined camera positions.

52. Apparatus according to any of claims 48 to 51, wherein means (c) includes means for calculating the relative 
position of the camera optical centre for the images. 

53. Apparatus according to any of claims 48 to 52, wherein the matching features comprise matching points. 

54. Apparatus according to any of claims 48 to 53, further comprising means for processing signals defining at least 
some of the matching features identified by the user or means (b) to generate object data defining a model of the 
object in a three-dimensional space. 

55. Apparatus according to claim 54, further comprising means for processing the object data to generate image data.

56. Apparatus according to claim 55, further comprising means for displaying an image of the object. 

57. A storage device storing instructions for causing a programmable processing apparatus to perform a method
according to any of claims 35 to 47.

58. A signal for causing a programmable processing apparatus to perform a method according to any of claims 35 to 47.

59. In an image processing apparatus having a processor for processing input signals defining images of an object
taken from a plurality of undefined camera positions, a method of processing the input signals to produce signals
defining matching features in the images, the method comprising the steps of:

(a) identifying matching features in the images using a first technique;

(b) calculating the camera positions using identified matching features; and

(c) identifying further matching features in the images using a second technique and the calculated camera
positions.

60. A method according to claim 59, wherein the first technique comprises processing the input signals to display the
images to a user, and storing signals defining matching features identified in the images by the user.

61. A method according to claim 60, wherein the first technique further comprises processing the input signals to
identify further matching features in the images using matching features identified by the user.

62. A method according to claim 61, wherein steps (a) and (b) comprise: 

(i) identifying matching features in the images; 

(ii) calculating the camera positions for the images using the matching features identified in step (i);
(iii) determining the accuracy of the camera positions calculated in step (ii); and
(iv) if the accuracy calculated in step (iii) is below a threshold:

- displaying the images to a user, and storing signals defining matching features identified in the images
by the user; and

- identifying further matching features in the images using a technique different to that used in step (i) and
the matching features identified by the user.

63. A method according to claim 61 or claim 62, wherein steps (a) and (b) comprise: 

(1) displaying the images to a user, and storing signals defining matching features identified in the images by
the user;

(2) identifying further matching features in the images using the matching features identified by the user in
step (1);

(3) calculating the camera positions using at least some of the matching features identified in step (1) or step (2);

(4) determining the accuracy of the camera positions calculated in step (3); and

(5) if the accuracy of the calculated camera positions is below a threshold, repeating steps (1) to (4) until the
accuracy is equal to, or above, the threshold.

64. A method according to any of claims 61 to 63, wherein the first technique comprises dividing each image into
regions in accordance with features identified by the user, calculating the transformation of corresponding regions
between images, and identifying matching features within corresponding regions using the calculated
transformations.

65. A method according to any of claims 61 to 64, wherein the number of features identified by the user is less than 
the number of further matching features identified using the features identified by the user. 

66. A method according to any of claims 59 to 65, wherein the first technique comprises processing the input signals
to identify matching corners in the images.

67. A method according to any of claims 59 to 66, wherein the second technique comprises processing the input
signals to search a part of a first of the images to identify a feature within the part which matches a feature in a
second of the images, the location of the part within the first image being dependent upon the location of the feature
in the second image and the calculated camera positions.
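
The constrained search of claim 67 can be sketched using standard epipolar geometry. This is an assumption about how "the part" of the first image might be derived: the claim only says that its location depends on the feature location in the second image and on the calculated camera positions. Here the camera positions are assumed to be summarised by a hypothetical 3x3 matrix F that maps a point in the second image to its epipolar line in the first image, and the part searched is a band of half-width max_dist around that line.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float, float]  # (a, b, c) with a*x + b*y + c = 0


def epipolar_line(F: Sequence[Sequence[float]], p: Point) -> Line:
    """Line in the first image on which the match for p (second image) should lie.

    Assumption: F maps homogeneous second-image points to first-image lines.
    """
    x, y = p
    a, b, c = (F[r][0] * x + F[r][1] * y + F[r][2] for r in range(3))
    return (a, b, c)


def candidates_near_line(line: Line, corners: Sequence[Point],
                         max_dist: float) -> List[Point]:
    """Keep only the corners of the first image lying within max_dist of the line."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0.0:
        return []  # degenerate line: no search band can be defined
    return [q for q in corners
            if abs(a * q[0] + b * q[1] + c) / norm <= max_dist]
```

Only the corners returned by candidates_near_line would then need to be compared with the feature in the second image, which is the benefit of restricting the search to a part of the image.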

68. A method according to claim 67, wherein the feature in the first image is a corner. 

69. A method according to any of claims 59 to 68, wherein the input signals define images of the object taken from at
least three undefined camera positions.

70. A method according to any of claims 59 to 69, wherein the step of calculating the camera positions includes
calculating the relative position of the camera optical centre for the images.

71. A method according to any of claims 59 to 70, wherein the matching features comprise matching points. 

72. A method according to any of claims 59 to 71, further comprising the step of processing signals defining at least
some of the identified matching features to generate object data defining a model of the object in a
three-dimensional space.

73. A method according to claim 72, further comprising the step of processing the object data to generate image data.

74. A method according to claim 73, further comprising the step of displaying an image of the object. 

75. A method according to claim 73 or claim 74, further comprising the step of recording the image data. 

76. A method according to any of claims 72 to 75, further comprising the step of transmitting a signal conveying the 
object data. 

77. A method according to any of claims 72 to 76, further comprising the step of recording the object data. 

78. A method of operating an image processing apparatus to process image data comprising images of an object 
taken from a plurality of imaging positions of undefined relationship, so as to identify corresponding object features 
in the images, the method comprising: 

identifying features using a first technique;

determining the relationship between the imaging positions using the identified features; and 

identifying further features using a second technique and the relationship between the imaging positions. 



79. An image processing apparatus for processing input signals defining images of an object taken from a plurality of
undefined camera positions to produce signals defining matching features in the images, comprising:



(a) means for identifying matching features in the images using a first technique; 

(b) means for calculating the camera positions using identified matching features; and 

(c) means for identifying further matching features in the images using a second technique and the calculated
camera positions.



80. Apparatus according to claim 79, wherein the first technique comprises processing the input signals to display the 
images to a user, and storing signals defining matching features identified in the images by the user. 



81. Apparatus according to claim 80, wherein the first technique further comprises processing the input signals to
identify further matching features in the images using matching features identified by the user.



82. Apparatus according to claim 81, wherein means (a) and (b) comprise means for identifying the matching features
and calculating the camera positions by: 

(i) identifying matching features in the images; 

(ii) calculating the camera positions for the images using the matching features identified in step (i); 

(iii) determining the accuracy of the camera positions calculated in step (ii); and 

(iv) if the accuracy calculated in step (iii) is below a threshold: 

displaying the images to a user, and storing signals defining matching features identified in the images 
by the user; and 

identifying further matching features in the images using a technique different to that used in step (i) and 
the matching features identified by the user. 

83. Apparatus according to claim 81 or claim 82, wherein means (a) and (b) comprise means for identifying the match- 
ing features and calculating the camera positions by: 





EP 0 898 245 A1 

(1) displaying the images to a user, and storing signals defining matching features identified in the images by
the user;

(2) identifying further matching features in the images using the matching features identified by the user in
step (1);

(3) calculating the camera positions using at least some of the matching features identified in step (1) or step (2);

(4) determining the accuracy of the camera positions calculated in step (3); and

(5) if the accuracy of the calculated camera positions is below a threshold, repeating steps (1) to (4) until the
accuracy is equal to, or above, the threshold.

84. Apparatus according to any of claims 81 to 83, wherein the first technique comprises dividing each image into
regions in accordance with features identified by the user, calculating the transformation of corresponding regions
between images, and identifying matching features within corresponding regions using the calculated
transformations.

85. Apparatus according to any of claims 81 to 84, wherein the number of features identified by the user is less than
the number of further matching features identified using the features identified by the user.

86. Apparatus according to any of claims 79 to 85, wherein the first technique comprises processing the input signals
to identify matching corners in the images.

87. Apparatus according to any of claims 79 to 86, wherein the second technique comprises processing the input
signals to search a part of a first of the images to identify a feature within the part which matches a feature in a
second of the images, the location of the part within the first image being dependent upon the location of the feature
in the second image and the calculated camera positions.

88. Apparatus according to claim 87, wherein the feature in the first image is a corner.

89. Apparatus according to any of claims 79 to 88, wherein the input signals define images of the object taken from
at least three undefined camera positions.

90. Apparatus according to any of claims 79 to 89, wherein the means for calculating the camera positions includes
means for calculating the relative position of the camera optical centre for the images.

91. Apparatus according to any of claims 79 to 90, wherein the matching features comprise matching points.

92. Apparatus according to any of claims 79 to 91, further comprising means for processing signals defining at least
some of the identified matching features to generate object data defining a model of the object in a
three-dimensional space.

93. Apparatus according to claim 92, further comprising means for processing the object data to generate image data.

94. Apparatus according to claim 93, further comprising means for displaying an image of the object.

95. A storage device storing instructions for causing a programmable processing apparatus to perform a method
according to any of claims 59 to 78.

96. A signal for causing a programmable processing apparatus to perform a method according to any of claims 59 to 78.

97. In an image processing apparatus having a processor for processing input signals defining images of an object
taken from at least three undefined camera positions, a method of processing the input signals to produce signals
defining matching features in the images and the camera positions, the method comprising the steps of:

(a) identifying matching features in first and second images of the object;

(b) calculating the camera positions for the first and second images using matching features identified in step
(a);

(c) identifying further matching features in the first and second images using the camera positions calculated
in step (b);

(d) matching at least one of the further matching features identified in the second image in step (c) with a
feature in a third of the images; and

(e) calculating the camera position for the third image using the matching feature(s) identified in the second 
and third images in step (d). 
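
Steps (d) and (e) of claim 97 rely on the second image being common to both pairs of images. The bookkeeping can be sketched as follows, assuming (hypothetically) that a feature in the second image is recorded with identical coordinates in both match lists, for example because it is the same detected corner. The function name chain_matches is illustrative only.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def chain_matches(
    matches_12: List[Tuple[Point, Point]],
    matches_23: List[Tuple[Point, Point]],
) -> List[Tuple[Point, Point, Point]]:
    """Link (image 1, image 2) and (image 2, image 3) matches through image 2.

    Assumption: the image-2 point of a feature is stored with exactly the
    same coordinates in both lists, so it can be used as a dictionary key.
    """
    third_by_second: Dict[Point, Point] = {p2: p3 for p2, p3 in matches_23}
    return [(p1, p2, third_by_second[p2])
            for p1, p2 in matches_12
            if p2 in third_by_second]
```

Each triple returned links a feature whose position is already constrained by the first two camera positions to its position in the third image, which is the information step (e) needs to calculate the third camera position.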

98. A method according to claim 97, wherein step (c) is performed using a different technique to step (a).

99. A method according to claim 97 or claim 98, wherein step (a) comprises processing the input signals to identify
matching corners in the first and second images.

100. A method according to any of claims 97 to 99, wherein steps (a) and (b) comprise:

(i) identifying matching features in the first and second images using a first technique; 

(ii) calculating the camera positions for the first and second images using the matching features identified in 
step (i); 

(iii) determining the accuracy of the camera positions calculated in step (ii); and

(iv) if the accuracy calculated in step (iii) is below a threshold: 

- displaying the first and second images to a user, and storing signals defining matching features identified
in the first and second images by the user; and
- identifying further matching features in the first and second images using a second technique and the
matching features identified by the user.

101. A method according to claim 100, wherein the first technique performed in step (i) comprises processing the input
signals to identify matching corners in the first and second images. 

102. A method according to any of claims 97 to 101, wherein steps (a) and (b) comprise: 

(1) displaying the first and second images to a user and storing signals defining matching features identified
in the first and second images by the user;

(2) identifying further matching features in the first and second images using the matching features identified
by the user in step (1);

(3) calculating the camera positions for the first and second images using at least some of the matching features
identified in step (1) or step (2);

(4) determining the accuracy of the camera positions calculated in step (3); and

(5) if the accuracy of the calculated camera positions is below a threshold, repeating steps (1) to (4) until the
accuracy is equal to, or above, the threshold.

103. A method according to any of claims 97 to 102, wherein step (c) comprises processing the input signals to search
a part of the second image to identify a feature within the part which matches a feature in the first image, the
location of the part within the second image being dependent upon the location of the feature in the first image
and the camera positions calculated in step (b).

104. A method according to claim 98, wherein step (a) comprises processing the input signals to display the first and
second images to a user and storing signals defining matching features identified in the first and second images
by the user.

105. A method according to claim 104, wherein step (a) further comprises processing the input signals to identify further 
matching features in the first and second images using matching features identified by the user. 

106. A method according to claim 105, wherein step (a) comprises dividing each of the first and second images into
regions in accordance with features identified by the user, calculating the transformation of corresponding regions 
between the first and second images, and identifying matching features within corresponding regions using the 
calculated transformations. 

107. A method according to claim 105 or claim 106, wherein the number of features identified by the user is less than
the number of further matching features identified using the features identified by the user. 

108. A method according to any of claims 104 to 107, wherein step (a) further comprises processing the input signals
to identify matching corners in the first and second images.

109. A method according to any of claims 97 to 108, wherein the matching features comprise matching points.

110. A method according to any of claims 97 to 109, wherein step (b) includes calculating the relative position of the
camera optical centre for the first and second images.

111. A method according to any of claims 97 to 110, wherein step (e) includes calculating the relative position of the
camera optical centre for the second and third images.

112. A method according to any of claims 97 to 111, further comprising the step of processing signals defining at least
[...] camera positions to generate object data defining a model of the object in a
three-dimensional space.

113. A method according to claim 112, further comprising the step of processing the object data to generate image data. 

114. A method according to claim 113, further comprising the step of displaying an image of the object. 

115. A method according to claim 113 or claim 114, further comprising the step of recording the image data.

116. A method according to any of claims 112 to 115, further comprising the step of transmitting a signal conveying the
object data.

117. A method according to any of claims 112 to 116, further comprising the step of recording the object data.

118. A method of operating an image processing apparatus to process image data comprising at least three images of
an object [...], so as to determine the positional relationship between the images, the method
comprising:

[...] the relationship between [...] second images [...] corresponding features [...];

[...] corresponding feature in the [...] using the positional [...];

[...] in a third of the images which corresponds to a [...]; and

[...] relationship between the second and third images using the [...].

119. A method of operating an image processing apparatus to process image data comprising at least three images of
an object and input signals defining the relationship between the positions at which first and second of the images
were recorded, so as to determine the relationship between the positions at which the second and a third of the
images were recorded, the method comprising:

(a) identifying at least one pair of corresponding object features in the first and second images using the
positional relationship defined in the input signals;

(b) [...] feature in the third image which corresponds to a feature identified in the second [...]; and

(c) [...] relationship between the second and third images using the [...].

120. An image processing apparatus for processing input signals defining images of an object taken from at least three
undefined camera positions to produce signals defining matching features in the images and the camera positions,
comprising:

(a) means for identifying matching features in first and second images of the object;

(b) means for calculating the camera positions for the first and second images using matching features
identified by means (a);

(c) means for identifying further matching features in the first and second images using the camera positions
calculated by means (b);

(d) means for matching at least one of the further matching features identified in the second image by means 
(c) with a feature in a third of the images; and 

(e) means for calculating the camera position for the third image using the matching feature(s) identified in 
the second and third images by means (d). 

121. Apparatus according to claim 120, wherein means (c) is arranged to identify matching features using a different
technique to means (a). 

122. Apparatus according to claim 120 or claim 121, wherein means (a) is arranged to process the input signals to 
identify matching corners in the first and second images. 

123. Apparatus according to any of claims 120 to 122, wherein means (a) and (b) are arranged to perform their
operations by:

(i) identifying matching features in the first and second images using a first technique; 

(ii) calculating the camera positions for the first and second images using the matching features identified in 
step (i); 

(iii) determining the accuracy of the camera positions calculated in step (ii); and

(iv) if the accuracy calculated in step (iii) is below a threshold: 

- displaying the first and second images to a user, and storing signals defining matching features identified
in the first and second images by the user; and
- identifying further matching features in the first and second images using a second technique and the
matching features identified by the user.

124. Apparatus according to claim 123, wherein the first technique performed in step (i) comprises processing the input
signals to identify matching corners in the first and second images.

125. Apparatus according to any of claims 120 to 124, wherein means (a) and (b) are arranged to operate by:

(1) displaying the first and second images to a user and storing signals defining matching features identified
in the first and second images by the user;

(2) identifying further matching features in the first and second images using the matching features identified
by the user in step (1);

(3) calculating the camera positions for the first and second images using at least some of the matching features 
identified in step (1) or step (2); 

(4) determining the accuracy of the camera positions calculated in step (3); and 

(5) if the accuracy of the calculated camera positions is below a threshold, repeating steps (1) to (4) until the
accuracy is equal to, or above, the threshold.

126. Apparatus according to any of claims 120 to 125, wherein means (c) comprises means for processing the input
signals to search a part of the second image to identify a feature within the part which matches a feature in the
first image, the location of the part within the second image being dependent upon the location of the feature in
the first image and the camera positions calculated by means (b).
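
One way to picture the restricted search of claim 126 is as a band around an epipolar line. The sketch below is illustrative only: the claim does not name epipolar geometry, and the 3x3 fundamental matrix `F`, the function names and the `half_width` parameter are all assumptions introduced here, on the premise that the camera positions from means (b) can be expressed as such a matrix.

```python
import math


def epipolar_line(F, point):
    """Return (a, b, c) of the line a*x + b*y + c = 0 in the second image.

    F is an assumed 3x3 fundamental matrix (list of rows) relating the first
    image to the second; point is an (x, y) location in the first image.
    """
    x, y = point
    h = (x, y, 1.0)  # homogeneous coordinates of the first-image point
    return tuple(sum(F[r][k] * h[k] for k in range(3)) for r in range(3))


def within_search_band(F, point1, candidate2, half_width):
    """True if candidate2 lies within half_width pixels of the epipolar line."""
    a, b, c = epipolar_line(F, point1)
    norm = math.hypot(a, b)
    if norm == 0.0:
        return False  # degenerate line: no usable constraint
    u, v = candidate2
    return abs(a * u + b * v + c) / norm <= half_width
```

Only candidates for which `within_search_band` is true would then be compared with the feature from the first image, which keeps the search to a part of the second image rather than the whole of it.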

127. Apparatus according to claim 121, wherein means (a) comprises means for processing the input signals to display
the first and second images to a user and storing signals defining matching features identified in the first and
second images by the user.

128. Apparatus according to claim 127, wherein means (a) further comprises means for processing the input signals 
to identify further matching features in the first and second images using matching features identified by the user. 

129. Apparatus according to claim 128, wherein means (a) comprises means for dividing each of the first and second
images into regions in accordance with features identified by the user, means for calculating the transformation of
corresponding regions between the first and second images, and means for identifying matching features within
corresponding regions using the calculated transformations.

130. Apparatus according to claim 128 or claim 129, wherein the number of features identified by the user is less than
the number of further matching features identified using the features identified by the user.

131. Apparatus according to any of claims 127 to 130, wherein means (a) further comprises means for processing the
input signals to identify matching corners in the first and second images. 

132. Apparatus according to any of claims 120 to 131, wherein the matching features comprise matching points. 

133. Apparatus according to any of claims 120 to 132, wherein means (b) includes means for calculating the relative 
position of the camera optical centre for the first and second images. 

134. Apparatus according to any of claims 120 to 133, wherein means (c) includes means for calculating the relative
position of the camera optical centre for the second and third images. 

135. Apparatus according to any of claims 120 to 134, further comprising means for processing signals defining at least
some of the matching features and the camera positions to generate object data defining a model of the object in 
a three-dimensional space. 

136. Apparatus according to claim 135, further comprising means for processing the object data to generate image data.

137. Apparatus according to claim 136, further comprising means for displaying an image of the object.

138. A storage device storing instructions for causing a programmable processing apparatus to perform a method
according to any of claims 97 to 119.

139. A signal for causing a programmable processing apparatus to perform a method according to any of claims 97 to 

140. In an image processing apparatus having a processor for processing first input signals defining images of an object
taken from a plurality of undefined camera positions and second input signals defining matching features in the
images, a method of processing the first and second input signals to produce signals defining further matching
features in the images, the method comprising the steps of:

dividing each image into regions on the basis of the matching features defined by the second input signals;

calculating the transformation of corresponding regions between images; and

identifying matching features within corresponding regions using the calculated transformations. 

141. A method according to claim 140, wherein each image is divided into regions by connecting matching features
defined by the second input signals to form triangular regions.

142. A method according to claim 140 or claim 141, wherein the step of calculating the transformation of corresponding
regions between images comprises calculating a respective transformation for each pair of corresponding regions.

143. A method according to any of claims 140 to 142, wherein the step of dividing each image into regions includes
the step of processing the first and second input signals to identify any edges between matched features defined
by the second input signals in at least one of the images, and connecting the matched features in
dependence upon the identified edges.

144. A method according to claim 143, wherein edges between matched features in the first image or the second image
are identified on the basis of edge direction values of pixels between the matched features.

145. A method according to claim [illegible], wherein edges between matched features in the first image or the second image
are identified on the basis of edge direction and edge strength values of pixels between the matched features.

146. A method according to claim 144 or claim 145, wherein the edges between matched features in the first image or
the second image are identified by considering only a central portion of the edge and not parts at the ends thereof.

147. A method according to any of claims 143 to 146, wherein the step of dividing each image into regions includes
the step of processing the first and second input signals to determine the strength of any edges between matched
features defined by the second input signals in at least one of the images, and connecting the matched features
in dependence upon the determined edge strengths.

148. A method according to claim 147, wherein the step of connecting matching features to form regions includes the
steps of processing the first and second input signals to determine the strength of any edges between matched
features defined by the second input signals in a first said image and the strength of any edges between matched
features defined by the second input signals in a second said image, calculating a combined strength measure for
corresponding edges in the first and second images, and connecting matched features in the first image and
matched features in the second image to form a side of a said region if the calculated combined strength measure
of the edges therebetween is greater than a threshold.

149. A method according to claim 148, wherein the combined strength measure for corresponding edges is determined
by calculating the geometric mean of the strength of the edge in the first image and the strength of the corresponding
edge in the second image.
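
The combined strength measure of claims 148 and 149 can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names, the dictionary keyed by pairs of feature indices, and the representation of strengths as non-negative numbers are not taken from the claims.

```python
import math


def combined_strength(strength_first, strength_second):
    """Geometric mean of an edge's strength in the first and second images."""
    return math.sqrt(strength_first * strength_second)


def edges_to_connect(edge_strengths, threshold):
    """Keep the edges whose combined strength is greater than the threshold.

    edge_strengths maps a pair of matched-feature indices (i, j) to a tuple
    (strength in first image, strength in second image). This layout is an
    assumption made for the sketch.
    """
    kept = {}
    for pair, (s_first, s_second) in edge_strengths.items():
        value = combined_strength(s_first, s_second)
        if value > threshold:
            kept[pair] = value
    return kept
```

Because the geometric mean is pulled down by the smaller of its two inputs, an edge that is strong in one image but nearly absent in the other scores low and is unlikely to become a region side.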

150. A method according to claim 148 or claim 149, wherein edges in an image having a combined strength measure 
greater than the threshold are processed to remove cross-overs therebetween, and matched features defining the 
resulting edges are connected to form a side of a said region. 

151. A method according to claim 150, wherein the edges are processed to remove cross-overs by:

(i) testing the edge with the highest combined strength against each edge of lower combined strength, in order
of decreasing combined strength, and, if it is determined that the two edges cross, deleting the edge with the
lower combined strength;

(ii) testing the edge of next highest combined strength which remains against each edge of lower combined
strength which remains, in order of decreasing combined strength and, if it is determined that the two edges
cross, deleting the edge with the lower combined strength; and

(iii) repeating step (ii) until the edge with the next highest combined strength which remains has the lowest
combined strength of the remaining edges.
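
Steps (i) to (iii) of claim 151 amount to a single pass over the edges in decreasing order of combined strength. The following Python sketch shows one possible reading. The segment-crossing test (a strict orientation test, so that edges which merely share an end point are not treated as crossing) and the data layout are assumptions, since the claim does not say how crossing is determined.

```python
def _orient(p, q, r):
    """Signed area test: positive if p, q, r turn anticlockwise."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def edges_cross(e1, e2):
    """True if the two segments properly cross (shared end points do not count)."""
    (a, b), (c, d) = e1, e2
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def remove_crossovers(edges):
    """edges is a list of (combined_strength, (point_p, point_q)) tuples.

    Returns the surviving edges, strongest first.
    """
    remaining = sorted(edges, key=lambda e: e[0], reverse=True)
    i = 0
    while i < len(remaining):
        strong = remaining[i][1]
        # Delete every weaker remaining edge that crosses the current edge.
        weaker = [e for e in remaining[i + 1:] if not edges_cross(strong, e[1])]
        remaining = remaining[:i + 1] + weaker
        i += 1
    return remaining
```

The loop ends when the edge under test is the weakest one left, which corresponds to the stopping condition in step (iii).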

152. A method according to any of claims 147 to 151, wherein any three matched features having therebetween two 
edges having a strength greater than a threshold are connected to form a triangular region in the first image and 
in the second image. 

153. A method according to any of claims 140 to 152, wherein, in the step of identifying matching features within
corresponding regions, features having an approximately uniform spatial separation in a first of the images are selected
for matching against features in a second of the images.

154. A method according to claim 153, wherein the features in the first image are selected by applying a grid to divide
the first image into areas, and selecting features from the areas.
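
The grid selection of claims 153 and 154 might look like the Python sketch below. Keeping the strongest feature in each grid cell is an assumption made for illustration; the claim says only that features are selected from the areas. The tuple layout `(x, y, strength)` and the parameter names are likewise hypothetical.

```python
def select_by_grid(features, width, height, cell):
    """Keep at most one feature per square grid cell of side `cell` pixels.

    features is a list of (x, y, strength) tuples. Features outside the
    image bounds are ignored. Within a cell the strongest feature wins,
    which is a choice made for this sketch only.
    """
    best = {}
    for x, y, strength in features:
        if not (0 <= x < width and 0 <= y < height):
            continue
        key = (int(x // cell), int(y // cell))
        if key not in best or strength > best[key][2]:
            best[key] = (x, y, strength)
    return [best[k] for k in sorted(best)]
```

Since each cell contributes at most one feature, the selected features have an approximately uniform spatial separation set by the cell size.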

155. A method according to any of claims 140 to 154, wherein the input signals define images of the object taken from
at least three undefined camera positions.

156. A method according to claim 155, wherein the step of identifying matching features within corresponding regions 
includes a step of trying to match at least some features in a first of the images already matched with features in 
a second of the images with features in a third of the images. 

157. A method according to any of claims 140 to 156, wherein the transformation calculated for corresponding regions
between images is an affine transformation.
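
For triangular regions (claim 141), the three corner correspondences determine an affine transformation exactly. The Python sketch below solves for it with Cramer's rule. It is illustrative only: the function names and the row layout of the result are assumptions, and the claims do not state how the transformation is computed.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def affine_from_triangles(src, dst):
    """Return [(a, b, c), (d, e, f)] with u = a*x + b*y + c and v = d*x + e*y + f.

    src and dst are lists of three (x, y) corner points of corresponding
    triangular regions in the first and second images.
    """
    m = [[x, y, 1.0] for x, y in src]
    det = _det3(m)
    if abs(det) < 1e-12:
        raise ValueError("source triangle is degenerate")
    rows = []
    for axis in range(2):
        rhs = [p[axis] for p in dst]
        coeffs = []
        for col in range(3):
            mc = [row[:] for row in m]
            for r in range(3):
                mc[r][col] = rhs[r]  # Cramer's rule: swap in the right-hand side
            coeffs.append(_det3(mc) / det)
        rows.append(tuple(coeffs))
    return rows


def apply_affine(t, point):
    """Map a point from the first image into the second image."""
    x, y = point
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])
```

A feature inside a region of the first image can be passed through `apply_affine` to predict where its match should lie in the corresponding region of the second image.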

158. A method according to any of claims 140 to 157, further comprising the step of processing the first input signals 
to generate the second input signals. 

159. A method according to claim 158, wherein the step of processing the first input signals to generate the second
input signals comprises processing the first input signals to display the images to a user, and storing signals defining
matching features identified in the images by the user.

160. A method according to any of claims 140 to 159, wherein the matching features comprise matching points.

161. A method according to claim 160, wherein the matching features comprise corner points.

162. A method according to any of claims 140 to 151, further comprising the step of processing signals defining at least
some of the matching features to generate object data defining a model of the object in a three-dimensional space.

163. A method according to claim 162, further comprising the step of processing the object data to generate image data. 

164. A method according to claim 163, further comprising the step of displaying an image of the object. 

165. A method according to claim 163 or claim 164, further comprising the step of recording the image data.

166. A method according to any of claims 162 to 165, further comprising the step of transmitting a signal conveying the

167. A method according to any of claims 162 to 166, further comprising the step of recording the object data.

168. A method of operating an image processing apparatus to process image data comprising images of an object
taken from a plurality of imaging positions of undefined relationship and signals defining corresponding object
features in the images, so as to identify further corresponding features, the method comprising:

notionally dividing each image into segments on the basis of the corresponding features defined in the input

determining the mapping of corresponding segments between images; and

identifying corresponding features using the calculated mappings.

169. Image processing apparatus for processing first input signals defining images of an object taken from a plurality
of undefined camera positions and second input signals defining matching features in the images to produce
signals defining further matching features in the images, comprising:

means for dividing each image into regions on the basis of the matching features defined by the
second input signals;

means for calculating the transformation of corresponding regions between images; and

means for identifying matching features within corresponding regions using the calculated transformations.

170. Apparatus according to claim 169, wherein the dividing means is arranged to divide each image into regions by
connecting matching features defined by the second input signals to form triangular regions.

171. Apparatus according to claim 169 or claim 170, wherein the calculating means is arranged to calculate a respective
transformation for each pair of corresponding regions.

172. Apparatus according to any of claims 169 to 171, wherein the dividing means includes means for processing the
first and second input signals to identify any edges between matched features defined by the second input signals
in at least one of the images, and means for connecting the matched features in dependence upon the identified
edges.

173. Apparatus according to claim 172, wherein edges between matched features in the first image or the second image
are identified on the basis of edge direction values of pixels between the matched features.

[...] are identified on the basis of edge direction and edge strength values of pixels between the matched features.

175. Apparatus according to claim 173 or claim 174, wherein the edges between matched features in the first image
or the second image are identified by considering only a central portion of the edge and not parts at the ends thereof.

176. Apparatus according to any of claims 173 to 175, wherein the dividing means includes means for processing the
first and second input signals to determine the strength of any edges between matched features defined by the
second input signals in at least one of the images, and means for connecting the matched features in dependence 
upon the determined edge strengths. 

177. Apparatus according to claim 176, wherein the dividing means includes means for processing the first and second 
input signals to determine the strength of any edges between matched features defined by the second input signals 
in a first said image and the strength of any edges between matched features defined by the second input signals 
in a second said image, means for calculating a combined strength measure for corresponding edges in the first 
and second images, and means for connecting matched features in the first image and matched features in the 
second image to form a side of a said region if the calculated combined strength measure of the edges therebetween
is greater than a threshold.

178. Apparatus according to claim 177, wherein the combined strength measure for corresponding edges is determined
by calculating the geometric mean of the strength of the edge in the first image and the strength of the corresponding
edge in the second image.

179. Apparatus according to claim 177 or claim 178, wherein edges in an image having a combined strength measure 
greater than the threshold are processed to remove cross-overs therebetween, and matched features defining the 
resulting edges are connected to form a side of a said region. 

180. Apparatus according to claim 179, wherein the edges are processed to remove cross-overs by: 

(i) testing the edge with the highest combined strength against each edge of lower combined strength, in order 
of decreasing combined strength, and, if it is determined that the two edges cross, deleting the edge with the 
lower combined strength; 

(ii) testing the edge of next highest combined strength which remains against each edge of lower combined 
strength which remains, in order of decreasing combined strength and, if it is determined that the two edges 
cross, deleting the edge with the lower combined strength; and 

(iii) repeating step (ii) until the edge with the next highest combined strength which remains has the lowest 
combined strength of the remaining edges. 

181. Apparatus according to any of claims 176 to 180, wherein any three matched features having therebetween two 
edges having a strength greater than a threshold are connected to form a triangular region in the first image and 
in the second image. 

182. Apparatus according to any of claims 169 to 181, wherein the identifying means comprises means for selecting 
features having an approximately uniform spatial separation in a first of the images and for matching the selected 
features against features in a second of the images. 

183. Apparatus according to claim 182, wherein the features in the first image are selected by applying a grid to divide 
the first image into areas, and selecting features from the areas. 

184. Apparatus according to any of claims 169 to 183, wherein the input signals define images of the object taken from
at least three undefined camera positions.

185. Apparatus according to claim 184, wherein the identifying means is arranged to try to match at least some features
in a first of the images already matched with features in a second of the images with features in a third of the images.

186. Apparatus according to any of claims 169 to 185, wherein the calculating means is arranged to calculate an affine
transformation.

187. Apparatus according to any of claims 169 to 186, further comprising means for processing the first input signals 
to generate the second input signals. 

188. Apparatus according to claim 187, wherein the means for processing the first input signals to generate the second
input signals comprises means for processing the first input signals to display the images to a user, and for storing
signals defining matching features identified in the images by the user.

189. Apparatus according to any of claims 169 to 188, wherein the matching features comprise matching points.

190. Apparatus according to claim 189, wherein the matching features comprise corner points.

192. Apparatus according to claim 191, further [...]

193. Apparatus according to claim 192, further comprising means for displaying an image of the object.

194. A storage device storing instructions for causing a programmable processing apparatus to perform a method [...]

195. A signal for causing a programmable processing apparatus to perform a method according to any of claims 140
Fig. 41. Flowchart, steps S520 to S528. Only one step text is legible: CHECK THAT COMBINED 3D POINTS CORRESPOND TO
UNIQUE IMAGE POINTS AND MERGE ONES THAT DO NOT.
Flowchart, steps S530 to S546:

S530: CONSIDER NEXT PAIR OF IMAGES
S532: PROJECT 3D LINE FROM EACH POINT IN NEXT PAIR OF POINTS IN CURRENT PAIR OF IMAGES WHICH FORM A USER-IDENTIFIED
DOUBLE OR PART OF A TRIPLE OF POINTS WITH A SUBSEQUENT IMAGE
S534: CALCULATE MID-POINT OF LINE WHICH CONNECTS, AND IS PERPENDICULAR TO, BOTH PROJECTED LINES
S536: HAS A CORRESPONDING POINT BEEN MATCHED IN NEXT IMAGE? (YES / NO)
S538: PROJECT 3D LINE FROM MATCHED POINT IN NEXT IMAGE
S540: CALCULATE MID-POINT OF LINE WHICH CONNECTS, AND IS PERPENDICULAR TO, THE NEW PROJECTED LINE AND THE
PROJECTED LINE FROM THE PREVIOUS IMAGE
S542: HAS A CORRESPONDING POINT BEEN MATCHED IN NEXT IMAGE?
S544: ANOTHER PAIR OF POINTS NOT PREVIOUSLY CONSIDERED IN CURRENT PAIR OF IMAGES WHICH FORM A USER-IDENTIFIED
DOUBLE OR PART OF A TRIPLE OF POINTS WITH A SUBSEQUENT IMAGE? (YES)
S546: ANOTHER PAIR OF IMAGES?
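
The mid-point calculation named in the flowchart above (the mid-point of the line which connects, and is perpendicular to, both projected lines) can be sketched as follows. This is a minimal Python illustration: each projected line is assumed to be given as an origin and a direction vector, and the function names and the handling of parallel lines are hypothetical.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def midpoint_between_lines(p1, d1, p2, d2, eps=1e-12):
    """Mid-point of the common perpendicular of two 3D lines.

    Line 1 is p1 + s*d1 and line 2 is p2 + t*d2. Returns None when the
    lines are (nearly) parallel, because the perpendicular is then not unique.
    """
    w = [a - b for a, b in zip(p1, p2)]
    a = _dot(d1, d1)
    b = _dot(d1, d2)
    c = _dot(d2, d2)
    d = _dot(d1, w)
    e = _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    s = (b * e - c * d) / denom  # parameter of the closest point on line 1
    t = (a * e - b * d) / denom  # parameter of the closest point on line 2
    q1 = [p + s * v for p, v in zip(p1, d1)]
    q2 = [p + t * v for p, v in zip(p2, d2)]
    return tuple((u + v) / 2.0 for u, v in zip(q1, q2))
```

With noisy image measurements the two projected lines rarely intersect, so taking the mid-point of their shortest connecting segment gives a single 3D point estimate for the matched pair.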
Fig. 44 (flowchart, continued on a second sheet). Step labels shown: S550, S552, S554, S558, S564, S566, S568. Step text, in order:

- CONSIDER ALL 3D POINTS, AND CALCULATE THE STANDARD DEVIATION OF THE X, Y AND Z COORDINATES -> ΔX, ΔY, ΔZ
- CALCULATE OBJECT SIZE = (ΔX^2 + ΔY^2 + ΔZ^2)^(1/2)
- FOR NEXT PAIR OF CAMERA POSITIONS, CONSIDER NEXT 3D POINT ORIGINATING FROM A TRIPLE OF POINTS WITH A SUBSEQUENT IMAGE
AND CALCULATE SHIFT BETWEEN THIS 3D POINT AND CORRESPONDING POINT PREVIOUSLY CALCULATED FOR SUBSEQUENT [illegible]
POSITIONS
- IS MAGNITUDE OF SHIFT > 10% OF OBJECT SIZE? (NO)
- CALCULATE [illegible] OF SHIFTS BETWEEN POINTS FOR CURRENT PAIR OF CAMERA POSITIONS AND POINTS FOR SUBSEQUENT PAIR
OF CAMERA POSITIONS [illegible] GIVE ERROR ROTATION MATRIX AND ERROR TRANSLATION VECTOR FOR SUBSEQUENT PAIR OF
CAMERA POSITIONS
- ADJUST POINTS FOR SUBSEQUENT PAIR OF CAMERA POSITIONS USING CALCULATED ERROR TO GIVE CORRECTED 3D POINTS
- CALCULATE DIFFERENCE BETWEEN EACH CORRECTED 3D POINT AND ITS CORRESPONDING POINT FOR CURRENT PAIR OF CAMERA
POSITIONS, AND CALCULATE COVARIANCE MATRIX (ERROR ELLIPSOID) OF THE DIFFERENCES
- CALCULATE CUMULATIVE ERROR FOR EACH PAIR OF CAMERA POSITIONS
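
The object size measure and the 10% shift test in Fig. 44 can be illustrated with a short Python sketch. The use of the population standard deviation, the function names and the `fraction` parameter are assumptions; the figure does not say which form of standard deviation is meant, nor what is done with a point that fails the test.

```python
import math
import statistics


def object_size(points):
    """Square root of the sum of squared per-axis standard deviations.

    points is a list of (x, y, z) tuples. The population standard deviation
    is used here as an assumption.
    """
    sx = statistics.pstdev([p[0] for p in points])
    sy = statistics.pstdev([p[1] for p in points])
    sz = statistics.pstdev([p[2] for p in points])
    return math.sqrt(sx * sx + sy * sy + sz * sz)


def shift_exceeds_limit(point_a, point_b, size, fraction=0.10):
    """True if the shift between two 3D points is more than fraction * size."""
    return math.dist(point_a, point_b) > fraction * size
```

Expressing the limit as a fraction of the object size makes the test independent of the arbitrary scale of the reconstructed 3D space.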
Fig. 45b.
Fig. 48. Flowchart, steps S580 to S586:

S580: SORT 3D POINTS IN ORDER OF SIZE OF ERROR ELLIPSOID (SMALLEST FIRST)
S582: COMPARE NEXT HIGHEST POINT IN LIST WITH ALL SUBSEQUENT POINTS AND IDENTIFY ALL SUBSEQUENT POINTS FOR WHICH
HIGHEST POINT UNDER CONSIDERATION IS WITHIN A DISTANCE OF 1 x ITS MAHALANOBIS DISTANCE
S584: COMBINE HIGHEST POINT UNDER CONSIDERATION WITH EVERY IDENTIFIED POINT TO PRODUCE ONE COMBINED POINT. REPLACE
HIGHEST POINT UNDER CONSIDERATION WITH COMBINED POINT, AND DISCARD IDENTIFIED POINTS USED TO CREATE COMBINED POINT
S586: ANOTHER POINT IN LIST NOT YET CONSIDERED? (YES)
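
The merging loop of Fig. 48 can be sketched in Python under simplifying assumptions that are not in the source: each error ellipsoid is taken to be axis-aligned and described by three variances, its size is taken as the product of those variances, and the combined point is a plain average that keeps the variances of the highest point. The figure does not specify any of these choices.

```python
import math


def mahalanobis(p, q, variances):
    """Mahalanobis distance between p and q for a diagonal covariance."""
    return math.sqrt(sum((a - b) ** 2 / v for a, b, v in zip(p, q, variances)))


def merge_points(points, limit=1.0):
    """points is a list of (position, variances) tuples of 3-tuples.

    Points are taken in order of increasing ellipsoid size. Each one absorbs
    every later point lying within `limit` Mahalanobis units of it.
    """
    pending = sorted(points, key=lambda pt: math.prod(pt[1]))
    merged = []
    while pending:
        pos, var = pending.pop(0)
        close, rest = [], []
        for pt in pending:
            (close if mahalanobis(pos, pt[0], var) <= limit else rest).append(pt)
        pending = rest  # absorbed points are discarded from the list
        group = [pos] + [pt[0] for pt in close]
        mean = tuple(sum(c) / len(group) for c in zip(*group))
        merged.append((mean, var))
    return merged
```

Processing the most precisely located points first means that each merge is judged against the tightest available error estimate.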
Fig. 49. Flowchart, steps S590 to S608 (the text of two decision boxes, each with a YES branch, is not legible):

S590: PERFORM DELAUNAY TRIANGULATION OF 3D POINTS
S592: CONSIDER NEXT CAMERA
S594: PROJECT RAY FROM CAMERA TO NEXT 3D POINT WHICH CAN BE SEEN BY THAT CAMERA
S596: REMOVE ANY SURFACE THE RAY INTERSECTS
S602: REMOVE ALL TRIANGLES WHICH DO NOT HAVE A SURFACE TOUCHING FREE SPACE
S604: CALCULATE NORMAL TO NEXT REMAINING TRIANGLE
S606: CALCULATE DOT PRODUCT BETWEEN NORMAL AND OPTICAL AXIS OF EACH CAMERA AND IDENTIFY CAMERA WHICH VIEWED THE
TRIANGLE CLOSEST TO NORMAL
S608: READ TEXTURE FOR TRIANGLE FROM DATA FOR IDENTIFIED CAMERA
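
The camera selection in Fig. 49 (dot product between the triangle normal and each camera's optical axis) can be sketched in Python. The sketch assumes that each optical axis is supplied as a unit vector and uses the absolute value of the dot product, so that the result does not depend on the winding order of the triangle. Both points are assumptions, as are the function names.

```python
import math


def triangle_normal(a, b, c):
    """Unit normal of the triangle with 3D vertices a, b, c."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(x * x for x in n))
    if length == 0.0:
        raise ValueError("degenerate triangle")
    return tuple(x / length for x in n)


def best_camera(triangle, optical_axes):
    """Index of the camera whose optical axis is closest to the triangle normal.

    optical_axes is a list of unit direction vectors, one per camera.
    """
    n = triangle_normal(*triangle)
    scores = [abs(sum(n[i] * axis[i] for i in range(3))) for axis in optical_axes]
    return max(range(len(scores)), key=scores.__getitem__)
```

The camera that views a triangle most nearly head-on records it with the least foreshortening, which is a natural reason for reading the texture from that camera's image.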
Fig. 50. Flowchart, steps S620 to S640. Step text, in order: CALCULATE LIGHTING PARAMETERS; DEFINE VIEWING DIRECTION;
PERFORM LOCAL TRANSFORMATION; LIGHT SURFACES; PERFORM VIEW TRANSFORMATION; CLIP; PROJECT TO DEFINE IMAGE IN 2-D;
CULL BACKFACES; SCAN CONVERT TO PIXELS; WRITE TO FRAME BUFFER; 2-D VIDEO IMAGE.